
openports v2 in two evenings: search, history, telegram, cron

What openports does

It indexes publicly exposed ComfyUI (8188) and Ollama (11434) endpoints discovered through Shodan + scrapers, verifies + fingerprints them itself, and stores the result. The idea is a temporal library, not a one-shot snapshot — the same IPs get re-checked on a cron, model lists get diffed, gone-down events get noted.

v1 worked but felt like a SQL admin panel. v2 was about turning the data into something queryable and the tool into something I'd actually open without thinking.

Round 1: visible data

The tables had everything (vram, model count, max params, ctx, country) but the UI showed three columns. Round 1 was just exposing what already existed:

  • service tabs (comfyui / ollama / all) instead of an unused dropdown
  • debounced search across ip, gpu, version, host, reverse-dns
  • model-name search that casts the JSON column to text and LIKEs
  • GPU and country dropdowns built from /api/instances/distinct/<field>
  • sortable columns on vram / models / max params / ctx / last seen
  • per-row "open in new tab" + "copy URL"
  • CSV export with the same filters
  • stats bar (total / by-service / new-24h / new-7d / stale > 24h)

That's the bulk of v2's user-facing value, and most of it was wiring already-collected data into UI affordances. The backend grew about 200 lines for new query params and a /api/stats aggregator.
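
The stats bar boils down to a handful of counts over the instance rows. A minimal sketch in plain Python, assuming each row carries `service`, `first_seen`, and `last_checked` fields; those field names and the `compute_stats` helper are hypothetical, not the actual /api/stats code:

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def compute_stats(rows, now=None):
    """Aggregate the counts shown in the stats bar from a list of dict rows."""
    now = now or datetime.now(timezone.utc)
    day_ago = now - timedelta(hours=24)
    week_ago = now - timedelta(days=7)
    return {
        "total": len(rows),
        "by_service": dict(Counter(r["service"] for r in rows)),
        # "new" is judged by when the instance was first seen (assumed field)
        "new_24h": sum(r["first_seen"] >= day_ago for r in rows),
        "new_7d": sum(r["first_seen"] >= week_ago for r in rows),
        # "stale" is judged by the last successful check (assumed field)
        "stale_24h": sum(r["last_checked"] < day_ago for r in rows),
    }
```

In a real backend these would more likely be SQL COUNT queries than a pass over loaded rows; the sketch only shows which comparisons sit behind each number.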

Round 2: shareable state

The biggest UX win was URL-synced filters. Every filter, the sort order, and the current page live in the query string:

/?service=comfyui&gpu=4090&country=Germany&min_vram=24&sort=max_model_params&dir=desc&page=2

That makes the URL the source of truth: reload-safe, deep-linkable, copy-paste shareable. Filter inputs are debounced locally so typing doesn't churn browser history.
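
The dashboard does this in the browser, but the round trip is easy to show in Python. A minimal sketch, assuming a small set of default values that are left out of the URL when unchanged; `DEFAULTS`, `filters_from_url`, and `url_from_filters` are hypothetical names for illustration:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical defaults; the real parameter set may differ.
DEFAULTS = {"service": "all", "sort": "last_seen", "dir": "desc", "page": "1"}


def filters_from_url(url):
    """Read filter state out of a URL's query string, falling back to defaults."""
    query = parse_qs(urlsplit(url).query)
    state = dict(DEFAULTS)
    # parse_qs returns a list per key; keep the last value given for each.
    state.update({key: values[-1] for key, values in query.items()})
    return state


def url_from_filters(state, path="/"):
    """Serialize filter state back to a URL, omitting values equal to the defaults."""
    changed = {k: v for k, v in state.items() if DEFAULTS.get(k) != v}
    return f"{path}?{urlencode(changed)}" if changed else path
```

Dropping default values keeps shared links short, and parsing a URL then serializing it again yields the same state, which is what makes reloads safe.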

Click-to-filter on the row cells (provider chip, country, gpu name, service badge) folded the filter UX into the data itself. No separate "filter by country" step — just click the country.

Round 3: stuff I actually use

These are the features that move the tool from "handy" to "where I look first":

  • Watch list. Per-row star, kept in localStorage. "★ starred" chip filters the list down. Doesn't need a database.
  • Saved searches. Name + URL snapshot in localStorage. Click to load a view, × to delete.
  • Density toggle + keyboard shortcuts (/ focus search, m model search, r refresh, a toggle alive-only, c compact rows, ? help). Once these are in muscle memory, filtering is a one-handed operation.
  • Stale chip + recheck button that triggers POST /api/scan/recheck, which re-fingerprints stored instances whose last check is older than RECHECK_STALE_AFTER_MINUTES.

The Telegram bot grew up

The bot used to support /scan and /status. v2 expanded it:

  • /help lists everything
  • /top [n] — top n alive by VRAM
  • /find gpu|model|country <value> — substring queries
  • /recheck [n] [force] [alive] — re-fingerprint stored instances
  • /status shows scheduler intervals + stale count

Telegram is genuinely a faster surface than the dashboard for "is the box I rented yesterday still up?" or "find me a 4090 in DE." Putting the same query logic behind both takes a 70-line commands.py.
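
Parsing these commands is mostly string splitting. A minimal sketch of how /find and /recheck arguments could be turned into query options; the function names, the default limit of 50, and the treatment of `force` and `alive` as plain boolean flags are assumptions, not the contents of the real commands.py:

```python
FIND_FIELDS = {"gpu", "model", "country"}


def parse_find(text):
    """Parse '/find <field> <value>' into (field, value); raise ValueError on bad input."""
    parts = text.split(maxsplit=2)
    if len(parts) < 3 or parts[0] != "/find" or parts[1] not in FIND_FIELDS:
        raise ValueError("usage: /find gpu|model|country <value>")
    return parts[1], parts[2].strip()


def parse_recheck(text, default_limit=50):
    """Parse '/recheck [n] [force] [alive]' into an options dict."""
    opts = {"limit": default_limit, "force": False, "alive": False}
    for token in text.split()[1:]:
        if token.isdigit():
            opts["limit"] = int(token)
        elif token in ("force", "alive"):
            opts[token] = True
        else:
            raise ValueError(f"unknown option: {token}")
    return opts
```

Keeping the parsers free of any Telegram types means the same option dicts can be passed to whatever query function the dashboard endpoints call.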

Cron + recheck

Two APScheduler jobs gated on env vars:

SCAN_INTERVAL_MINUTES=120         # 0 disables
RECHECK_INTERVAL_MINUTES=60
RECHECK_STALE_AFTER_MINUTES=60    # skip checks fresher than this

The recheck job re-fingerprints in batches and only touches instances older than the stale threshold, so concurrent operators (the bot, the cron, and a manual button click) don't all re-poll the same boxes.
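
The selection step can be sketched without the scheduler. This assumes each stored instance has a `last_checked` timestamp and that a non-positive interval disables a job (the config comment only states this for SCAN_INTERVAL_MINUTES); `interval_minutes` and `select_stale` are hypothetical helpers:

```python
import os
from datetime import datetime, timedelta, timezone


def interval_minutes(name, default):
    """Read an interval from the environment; 0 or less means the job is disabled."""
    minutes = int(os.environ.get(name, default))
    return minutes if minutes > 0 else None


def select_stale(instances, stale_after_minutes, batch_size, now=None):
    """Return up to batch_size instances last checked before the cutoff, oldest first."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(minutes=stale_after_minutes)
    stale = [i for i in instances if i["last_checked"] < cutoff]
    stale.sort(key=lambda i: i["last_checked"])
    return stale[:batch_size]
```

Because a successful recheck moves `last_checked` forward, the next trigger that runs the same selection skips that instance until it goes stale again.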

The bug that almost shipped

Cast-to-text on a JSON column for model search:

stmt = stmt.where(
    func.lower(func.cast(Instance.models, text("TEXT"))).like(like)
)

That throws 'TextClause' object has no attribute '_static_cache_key' because cast() wants a SQLAlchemy TypeEngine, not a text() clause. Easy fix:

from sqlalchemy import Text, cast, func

# cast() takes a type object (Text), not a text("TEXT") clause
stmt = stmt.where(
    func.lower(cast(Instance.models, Text)).like(like)
)

Caught it in a manual smoke pass over the new endpoints with curl. Lesson: when adding a query param, run it once before claiming done. Type checks won't save you here.
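
The same kind of smoke pass can be scripted. A sketch using only the standard library's sqlite3, assuming the models column holds a JSON array serialized as text; the table layout, sample rows, and `search_models` helper are made up for illustration, and the SQL is roughly what the fixed cast expression should render to rather than a capture of the real query:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instances (ip TEXT, models JSON)")
conn.executemany(
    "INSERT INTO instances VALUES (?, ?)",
    [
        ("192.0.2.1", json.dumps(["Qwen2.5-32B", "llama3.1:8b"])),
        ("192.0.2.2", json.dumps(["mistral:7b"])),
    ],
)


def search_models(conn, needle):
    """Case-insensitive substring match against the serialized model list."""
    like = f"%{needle.lower()}%"
    rows = conn.execute(
        "SELECT ip FROM instances WHERE lower(CAST(models AS TEXT)) LIKE ?",
        (like,),
    )
    return [ip for (ip,) in rows]
```

Running one matching and one non-matching search against a throwaway database like this is enough to catch a query that fails at execution time.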

What's next

Round 4: per-instance history table + diff detection (so we can answer "this RTX 5090 box came online Tuesday, picked up qwen2.5-32b on Thursday, went down Sunday"). Plus a /models catalog page aggregating every unique model seen across all instances, with counts.
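
None of this is built yet, but the diff step itself is small. One possible shape, assuming each check stores the instance's model names as a list; `diff_models` is a hypothetical helper:

```python
def diff_models(previous, current):
    """Compare model lists from two consecutive checks; report additions and removals."""
    before, after = set(previous), set(current)
    return {
        "added": sorted(after - before),
        "removed": sorted(before - after),
    }
```

A history row would only need to be written when either list is non-empty, which keeps the table from growing on every unchanged recheck.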

Repo: safzanpirani/openports.