
Community Scrapers

Write a Minato scraper in TypeScript and publish it from a Git repository.

A community scraper is a TypeScript module that Minato loads and executes on your server. If you need native binaries, a language outside the npm ecosystem, or full control over a Docker container's lifecycle, use a sidecar instead.

Quick start

Two files are all you need.

my-scraper/
  package.json
  index.ts

package.json

{
  "name": "my-scraper",
  "version": "1.0.0",
  "title": "My Scraper",
  "description": "One-line description of what your scraper indexes",
  "author": "yourhandle",
  "repository": "https://github.com/you/my-scraper",
  "module": "index.ts",
  "type": "module",
  "minato": {
    "capabilities": ["ingest", "status", "commands"]
  }
}

index.ts

import { defineScheduledScraper } from "@minato/skit";

export default defineScheduledScraper({
  recommendedSchedule: "0 */6 * * *",
  async run({ ingest, signal, status, meta }) {
    status.update({ phase: "running" });

    // fetchTorrents() is a placeholder for your own fetching logic.
    const torrents = await fetchTorrents();
    for (const t of torrents) {
      if (signal.aborted) break;
      ingest.add({
        infoHash: t.hash,
        title: t.name,
        size: t.bytes,
        source: { name: meta.name, url: t.url },
      });
    }

    status.update({ phase: "idle" });
  },
});
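
The quick start calls fetchTorrents() without defining it. As a minimal sketch of the parsing half, assume a hypothetical upstream feed that returns a JSON array of records with hash, name, bytes, and url fields (the fields the loop reads), and that info hashes are 40-character hex strings. The feed shape and the parseTorrentFeed name are illustrative assumptions, not part of Minato or Skit.

```typescript
// Shape the quick-start loop reads from each item (t.hash, t.name, t.bytes, t.url).
interface Torrent {
  hash: string;
  name: string;
  bytes: number;
  url: string;
}

// Hypothetical helper: turn the raw JSON body of an upstream feed into Torrent records.
// Assumes the body is a JSON array of objects; malformed entries are skipped, not thrown on.
function parseTorrentFeed(body: string): Torrent[] {
  const data: unknown = JSON.parse(body);
  if (!Array.isArray(data)) return [];

  const out: Torrent[] = [];
  for (const item of data) {
    if (typeof item !== "object" || item === null) continue;
    const r = item as Record<string, unknown>;
    if (
      typeof r.hash === "string" &&
      /^[0-9a-fA-F]{40}$/.test(r.hash) && // assumption: hex-encoded 20-byte info hash
      typeof r.name === "string" &&
      typeof r.bytes === "number" &&
      isFinite(r.bytes) &&
      typeof r.url === "string"
    ) {
      // Lower-case the hash so the same torrent always has the same key.
      out.push({ hash: r.hash.toLowerCase(), name: r.name, bytes: r.bytes, url: r.url });
    }
  }
  return out;
}
```

With a helper like this, fetchTorrents() can be a thin wrapper that downloads the feed body with fetch() and passes the text to parseTorrentFeed().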

Lifecycle

A scraper is either scheduled or a daemon. The factory function you export determines which.

Scheduled

run() does one complete pass and returns. Minato treats the return as completion and schedules the next run. recommendedSchedule is your suggested default (5-field UTC cron). The admin can change it per-instance without touching your code.

import { defineScheduledScraper } from "@minato/skit";

export default defineScheduledScraper({
  recommendedSchedule: "0 */6 * * *",
  async run({ ingest }) {
    const results = await fetchAllPages();
    for (const r of results) ingest.add(r);
  },
});

Write run() as a single pass. Don't implement a repeat loop inside it.
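
To make the recommendedSchedule value concrete, here is a small sketch that expands an ordinary 5-field cron expression (minute, hour, day of month, month, day of week) into the values each field matches. It is only an illustration of the format; parseCron and expandField are hypothetical helpers, and the exact cron syntax Minato accepts may differ.

```typescript
// The five cron fields, in order, with the range of values each accepts.
const CRON_FIELDS = [
  { name: "minute", min: 0, max: 59 },
  { name: "hour", min: 0, max: 23 },
  { name: "dayOfMonth", min: 1, max: 31 },
  { name: "month", min: 1, max: 12 },
  { name: "dayOfWeek", min: 0, max: 6 }, // 0 = Sunday
];

const DIGITS = /^\d+$/;

// Expand one field into the sorted list of values it matches.
// Handles "*", "*/n", "a", "a-b", "a-b/n" and comma lists; no names such as MON or JAN.
function expandField(field: string, min: number, max: number): number[] {
  const values: number[] = [];
  for (const part of field.split(",")) {
    const [range = "", stepText = "1"] = part.split("/");
    if (!DIGITS.test(stepText) || Number(stepText) < 1) {
      throw new Error(`bad step in "${part}"`);
    }
    let lo = min;
    let hi = max;
    if (range !== "*") {
      const [loText = "", hiText = loText] = range.split("-");
      if (!DIGITS.test(loText) || !DIGITS.test(hiText)) {
        throw new Error(`bad value in "${part}"`);
      }
      lo = Number(loText);
      hi = Number(hiText);
    }
    if (lo < min || hi > max || lo > hi) {
      throw new Error(`"${part}" is outside ${min}-${max}`);
    }
    for (let v = lo; v <= hi; v += Number(stepText)) {
      if (values.indexOf(v) === -1) values.push(v);
    }
  }
  return values.sort((a, b) => a - b);
}

// Split an expression into its five fields and expand each one.
function parseCron(expr: string): Record<string, number[]> {
  const parts = expr.trim().split(/\s+/);
  if (parts.length !== CRON_FIELDS.length) {
    throw new Error(`expected 5 fields, got ${parts.length}`);
  }
  const result: Record<string, number[]> = {};
  CRON_FIELDS.forEach((f, i) => {
    result[f.name] = expandField(parts[i] ?? "", f.min, f.max);
  });
  return result;
}

// "0 */6 * * *" matches minute 0 of hours 0, 6, 12 and 18, every day.
const schedule = parseCron("0 */6 * * *");
console.log(schedule.minute, schedule.hour);
```

Read this way, the example schedule triggers four runs per day, at 00:00, 06:00, 12:00 and 18:00 UTC.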

Daemon

For scrapers that can't complete in one shot: DHT nodes, WebSocket feeds, IRC bots. Minato keeps run() alive indefinitely and restarts it on crash. When signal aborts, clean up and return.

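// Assumed to be exported from the same package as defineScheduledScraper.
import { defineDaemonScraper } from "@minato/skit";

export default defineDaemonScraper({
  async run({ ingest, signal }) {
    const socket = await openConnection();
    socket.on("data", (chunk) => ingest.add(parseTorrent(chunk)));

    // Stay alive until Minato aborts. The explicit check covers an abort
    // that fired while openConnection() was still pending.
    if (!signal.aborted) {
      await new Promise<void>((resolve) =>
        signal.addEventListener("abort", () => resolve(), { once: true }),
      );
    }
    socket.close();
  },
});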

Config

Put user-editable values in minato.defaultConfig. They appear as configurable fields in the dashboard and persist across scraper updates. Your scraper receives them already merged; dashboard values always override code defaults.

Mirror those values in the config field of your definition to get TypeScript types and a fallback for when no override exists.

package.json

{
  "minato": {
    "defaultConfig": {
      "maxPages": 100,
      "delayMs": 250
    }
  }
}

index.ts

import { defineScheduledScraper } from "@minato/skit";

export default defineScheduledScraper<{ maxPages: number; delayMs: number }>({
  config: { maxPages: 100, delayMs: 250 },

  async run({ config }) {
    for (let page = 1; page <= config.maxPages; page++) {
      await Bun.sleep(config.delayMs);
      // ...
    }
  },
});

Keep defaultConfig flat, with only string, number, and boolean values. Nested objects aren't editable from the dashboard.
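
The precedence rule above can be sketched in a few lines. Minato performs the merge before run() is called, so a scraper never needs this code; mergeConfig is a hypothetical stand-in that only illustrates "dashboard values override code defaults" for flat values. Ignoring keys that are not declared in the defaults is an assumption of this sketch.

```typescript
// Values a flat config may hold.
type FlatValue = string | number | boolean;
type FlatConfig = Record<string, FlatValue>;

function isFlatValue(value: unknown): value is FlatValue {
  return typeof value === "string" || typeof value === "number" || typeof value === "boolean";
}

// Illustrative only: layer dashboard overrides on top of code defaults.
// An override wins whenever it supplies a flat value for a declared key;
// undeclared keys and nested values are ignored (assumption of this sketch).
function mergeConfig<T extends FlatConfig>(defaults: T, overrides: Record<string, unknown>): T {
  const merged: FlatConfig = { ...defaults };
  for (const key of Object.keys(defaults)) {
    const value = overrides[key];
    if (isFlatValue(value)) merged[key] = value;
  }
  return merged as T;
}

// The admin changed delayMs in the dashboard; maxPages keeps its code default.
const effective = mergeConfig({ maxPages: 100, delayMs: 250 }, { delayMs: 500 });
console.log(effective.maxPages, effective.delayMs);
```

Here effective.maxPages stays at 100 and effective.delayMs becomes 500, which is the shape of the config object run() would receive.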


Pause and stop

If you declare "commands" in capabilities, signal is an AbortSignal that fires when the user sends pause or stop from the dashboard. Check it at loop boundaries:

for (const page of pages) {
  if (signal.aborted) break;
  // ...
}

Buffered torrents flush automatically before the process exits. Paused scrapers can be resumed on demand; stopped ones wait for their next scheduled run.
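
Checking signal.aborted only helps at the moments the loop reaches the check. A fixed delay such as the Bun.sleep(config.delayMs) call in the Config example keeps waiting until it finishes, so a long delay postpones the reaction to pause or stop. One possible approach, sketched below with standard timers and AbortSignal events only, is a sleep that ends early on abort. The name sleepOrAbort is hypothetical and not part of Skit.

```typescript
// Wait for `ms` milliseconds, or until `signal` aborts, whichever comes first.
// Resolves to true if the full delay elapsed, false if the wait was cut short.
function sleepOrAbort(ms: number, signal: AbortSignal): Promise<boolean> {
  return new Promise<boolean>((resolve) => {
    // Already aborted: do not start a timer at all.
    if (signal.aborted) {
      resolve(false);
      return;
    }

    const onAbort = () => {
      clearTimeout(timer); // drop the pending timer so it cannot keep the process alive
      resolve(false);
    };

    const timer = setTimeout(() => {
      signal.removeEventListener("abort", onAbort); // avoid leaking the listener
      resolve(true);
    }, ms);

    signal.addEventListener("abort", onAbort, { once: true });
  });
}
```

Inside run(), a loop can then use `if (!(await sleepOrAbort(config.delayMs, signal))) break;` so that the delay and the abort check become a single step.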


Packaging

Add a dependencies key for any npm packages you need. Minato installs them when the scraper is first installed and again on each update. Ship source only; no build step is needed.

{
  "dependencies": { "cheerio": "^1.0.0" }
}

Scrapers run under Bun, which gives access to Bun APIs (Bun.sleep, Bun.file, etc.) alongside the full Node.js-compatible standard library.


package.json reference

The scraper's ID and version come from the top-level name and version fields.

Top-level fields:

| Field | Required | Description |
|---|---|---|
| name | | Unique ID: kebab-case, used as the scraper's internal identifier |
| title | | Display name shown in the dashboard and registry |
| description | | Short description shown in the registry listing |
| author | | Your handle or name |
| repository | | Source URL: where users file issues and find your code |
| module | | Entrypoint path, relative to the package root |

minato block fields:

| Field | Required | Description |
|---|---|---|
| minato.capabilities | | See below |
| minato.defaultConfig | | User-editable config values |

Capabilities (declare only what your scraper uses):

| Value | What it unlocks |
|---|---|
| "ingest" | Required. Posting torrents to Minato. |
| "status" | Progress bar and phase display in the dashboard |
| "commands" | Pause / stop / resume from the dashboard |
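
The rules in this reference can be checked before publishing. The sketch below is a hypothetical pre-publish helper, not Minato's own validation. It only encodes what this page states: name is kebab-case, version is present, capabilities include "ingest" and use only the three listed values, and defaultConfig is flat.

```typescript
const KNOWN_CAPABILITIES = ["ingest", "status", "commands"];

// Hypothetical check: returns a list of human-readable problems (empty when none are found).
function checkManifest(manifest: unknown): string[] {
  if (typeof manifest !== "object" || manifest === null) {
    return ["manifest must be a JSON object"];
  }
  const m = manifest as Record<string, unknown>;
  const problems: string[] = [];

  if (typeof m.name !== "string" || !/^[a-z0-9]+(-[a-z0-9]+)*$/.test(m.name)) {
    problems.push("name must be a kebab-case string");
  }
  if (typeof m.version !== "string" || m.version === "") {
    problems.push("version must be a non-empty string");
  }

  const minato = (typeof m.minato === "object" && m.minato !== null ? m.minato : {}) as Record<string, unknown>;

  const caps = minato.capabilities;
  if (!Array.isArray(caps)) {
    problems.push("minato.capabilities must be an array");
  } else {
    if (caps.indexOf("ingest") === -1) {
      problems.push('minato.capabilities must include "ingest"');
    }
    for (const cap of caps) {
      if (KNOWN_CAPABILITIES.indexOf(cap) === -1) {
        problems.push(`unknown capability: ${String(cap)}`);
      }
    }
  }

  const cfg = minato.defaultConfig;
  if (cfg !== undefined) {
    if (typeof cfg !== "object" || cfg === null || Array.isArray(cfg)) {
      problems.push("minato.defaultConfig must be an object");
    } else {
      const entries = cfg as Record<string, unknown>;
      for (const key of Object.keys(entries)) {
        const kind = typeof entries[key];
        if (kind !== "string" && kind !== "number" && kind !== "boolean") {
          problems.push(`minato.defaultConfig.${key} must be a string, number, or boolean`);
        }
      }
    }
  }
  return problems;
}
```

Running it on the parsed contents of package.json, for example checkManifest(JSON.parse(text)), gives a quick sanity check before opening a registry pull request.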

Publishing

Push to any public Git host. Users install by pasting the URL into Settings → Scrapers → Install from URL.

To get listed in the one-click registry, open a pull request to minato-registry.

For the full runtime API, see Skit SDK.
