Scraping an Arris cable modem status page

Screenshot of the modem status page with arrows pointing to a screenshot of the data's final form in Grafana.
This project’s purpose is to start with a status page and end with Grafana graphs and alerts.

It felt good to finish a project that’s been on my list for quite some time. The main goal was to scrape the values from my modem’s status page and pipe them into InfluxDB, which feeds Grafana. That way I can not only look at trends in the data, but also receive alerts when a value crosses an acceptable threshold.

Overall this is a straightforward process:

  1. Pull in the HTML from the status page (which happens to not need any authentication, making it even easier)
  2. Parse the tables we care about (Downstream and Upstream) using XPaths
  3. Munge the data into something suitable for InfluxDB
  4. Insert the data into InfluxDB
  5. Query the InfluxDB data from Grafana
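To make steps 1 through 3 concrete, here is a minimal sketch using only the standard library. It is not the code from the repo: the table `id`, the column names, and the sample rows are made up for illustration, and the modem URL in the comment is an assumption. A real status page is rarely well-formed XML, so a more forgiving parser with full XPath support (lxml, for example) would likely be needed in practice. The output is InfluxDB line protocol text, ready for step 4.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the modem's Downstream table.
# The real page's markup, column names, and units will differ.
SAMPLE_HTML = """
<html><body>
<table id="downstream">
  <tr><th>Channel</th><th>Frequency</th><th>Power</th><th>SNR</th><th>Corrected</th></tr>
  <tr><td>1</td><td>603.00 MHz</td><td>2.10 dBmV</td><td>38.98 dB</td><td>12</td></tr>
  <tr><td>2</td><td>609.00 MHz</td><td>-1.50 dBmV</td><td>38.61 dB</td><td>0</td></tr>
</table>
</body></html>
"""


def fetch_status_page(url):
    """Step 1: pull in the HTML. No authentication is needed.

    The URL is whatever address your modem answers on, e.g.
    "http://192.168.100.1/" (an assumption, check your own device).
    """
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_table(html, table_id):
    """Step 2: select one table's rows with an XPath-style query."""
    root = ET.fromstring(html.strip())
    rows = root.findall(f".//table[@id='{table_id}']/tr")
    header = [th.text.strip().lower() for th in rows[0].findall("th")]
    return [
        dict(zip(header, (td.text.strip() for td in row.findall("td"))))
        for row in rows[1:]
    ]


def to_line(measurement, row):
    """Step 3: turn one row into an InfluxDB line protocol string.

    The channel becomes a tag; every other column becomes a float
    field with its unit suffix ("MHz", "dBmV", "dB") dropped.
    """
    fields = {k: float(v.split()[0]) for k, v in row.items() if k != "channel"}
    field_str = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement},channel={row['channel']} {field_str}"


for row in parse_table(SAMPLE_HTML, "downstream"):
    print(to_line("downstream", row))
```

Keeping the channel as a tag rather than a field means Grafana can group or filter by channel without any extra work in the query.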

I knew I wanted to use Python for the project, so I first looked into Scrapy. After wrapping my mind around it (somewhat), I gave it a go and actually had a working solution… but it felt way over-engineered and, at times, inflexible for what I wanted. I threw 90% of that solution away and went with a simpler script.

What I landed on is custom and lightweight, but extensible in case someone has a different status page or wants to use an alternative to InfluxDB.
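One way to get that kind of flexibility, sketched here with hypothetical names rather than the repo's actual structure, is to treat the page parser and the storage backend as plain functions that get passed in. Supporting another modem or another database then means writing one new function and leaving the rest alone.

```python
from typing import Callable, Dict, List

# Hypothetical types for illustration; the repo may be organized differently.
Row = Dict[str, str]
Parser = Callable[[str], List[Row]]    # status page text -> rows
Writer = Callable[[List[Row]], None]   # rows -> some storage backend


def run_once(page: str, parser: Parser, writer: Writer) -> int:
    """Parse one status page, hand the rows to the writer, return the row count."""
    rows = parser(page)
    writer(rows)
    return len(rows)


def toy_parser(page: str) -> List[Row]:
    """Stand-in parser for a made-up 'channel,power' text format."""
    rows = []
    for line in page.strip().splitlines():
        channel, power = line.split(",")
        rows.append({"channel": channel, "power": power})
    return rows


collected: List[Row] = []  # stand-in backend: keep the rows in memory
count = run_once("1,2.10\n2,-1.50", toy_parser, collected.extend)
print(count, collected)
```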

Grafana screenshot showing that a fluctuation in downstream power around 10:00 a.m. caused the "Correcteds" values to spike.
I’ve had it running for a day and I’m already seeing interesting data!

See the repo on GitHub!