yomi

CLI reference

Every yomi command and flag.

yomi [command] [flags]

Five commands: read turns one page into Markdown, site reads a whole site, meta prints a page's metadata as JSON, links lists a page's outbound links, and serve previews a folder of Markdown. Run yomi <command> --help for the canonical, up-to-date list of that command's flags.

The flags listed under Shared read flags apply to read, site, meta, and links, since each fetches and extracts a page before doing its job.

yomi read

yomi read <url> [flags]

Reads one page into Markdown, printing to stdout or writing to a file. Fetches the page, renders it in headless Chrome only when needed, extracts the main content, and converts it to GitHub-Flavored Markdown with a YAML front-matter header.

| Flag | Default | Meaning |
| --- | --- | --- |
| -o, --out | stdout | Write the Markdown to a file instead of stdout |

Plus the shared read flags.
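
When scripting around the command, a thin wrapper is often enough. The Python sketch below assumes a yomi binary is on PATH. The helper names build_read_command and read_page are hypothetical, and only the documented --out flag is used.

```python
import subprocess
from typing import List, Optional


def build_read_command(url: str, out: Optional[str] = None) -> List[str]:
    """Build the argv for `yomi read`, using only the documented --out flag."""
    argv = ["yomi", "read", url]
    if out is not None:
        argv += ["--out", out]
    return argv


def read_page(url: str) -> str:
    """Run `yomi read` and return the Markdown it prints to stdout.

    Assumes a `yomi` binary is on PATH; raises CalledProcessError on a
    non-zero exit status.
    """
    result = subprocess.run(
        build_read_command(url), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Keeping argv construction separate from execution makes the command easy to log or inspect before anything is fetched.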

yomi site

yomi site <url> [flags]

Crawls a whole site breadth-first from the seed URL, reading each in-scope page into Markdown. The default output is a folder of .md files mirroring the URL paths, with a SUMMARY.md table of contents and a shared media/ folder. --single assembles one combined file instead.

Output

| Flag | Default | Meaning |
| --- | --- | --- |
| -o, --out | host name | Output folder, or file path with --single |
| -s, --single | false | Assemble one combined .md file with a table of contents and per-page sections |

Scope

| Flag | Default | Meaning |
| --- | --- | --- |
| -p, --max-pages | 0 | Stop after N pages (0 = unlimited) |
| -d, --max-depth | 0 | Link-follow depth cap (0 = unlimited) |
| --subdomains | false | Treat subdomains of the seed host as in scope |
| --scope-prefix | | Only crawl pages whose path starts with this prefix |
| --exclude | | Path prefixes to skip (repeatable) |
| --no-robots | false | Ignore robots.txt |

Concurrency

| Flag | Default | Meaning |
| --- | --- | --- |
| --workers | 4 | Concurrent page workers |

Plus the shared read flags, applied to every page in the crawl.
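
To make the interaction of the scope options concrete, here is a minimal Python sketch of a breadth-first crawl order over an in-memory link graph. It is an illustration of the documented semantics, not yomi's implementation. The function name crawl_order is hypothetical, the seed is assumed to count as depth 0, and subdomain handling and robots.txt are left out.

```python
from collections import deque
from typing import Dict, List, Sequence
from urllib.parse import urlparse


def crawl_order(
    seed: str,
    links: Dict[str, List[str]],
    max_pages: int = 0,
    max_depth: int = 0,
    scope_prefix: str = "",
    exclude: Sequence[str] = (),
) -> List[str]:
    """Return URLs in the order a breadth-first crawl would read them.

    `links` maps each URL to the URLs it links to (a stand-in for fetching).
    As in the flag table, 0 means unlimited for max_pages and max_depth.
    Assumption: the seed counts as depth 0.
    """
    seed_host = urlparse(seed).netloc

    def in_scope(url: str) -> bool:
        parts = urlparse(url)
        if parts.netloc != seed_host:  # other hosts, including subdomains, are skipped
            return False
        if scope_prefix and not parts.path.startswith(scope_prefix):
            return False
        return not any(parts.path.startswith(p) for p in exclude)

    order: List[str] = []
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if max_pages and len(order) >= max_pages:
            break
        if max_depth and depth >= max_depth:
            continue  # read this page but do not follow its links
        for target in links.get(url, []):
            if target not in seen and in_scope(target):
                seen.add(target)
                queue.append((target, depth + 1))
    return order
```

Because the queue is first-in, first-out, every page at depth N is read before any page at depth N + 1, which is why a page cap tends to keep the pages closest to the seed.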

yomi meta

yomi meta <url> [flags]

Prints the page's metadata record as JSON (title, byline, site, language, word_count, reading_time, links, images), without the Markdown body. Takes the shared read flags.
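
Because the record is JSON, it is straightforward to consume from a script. The Python sketch below summarizes a record in one line. The field names are the ones listed above, while the value types (a string title, an integer word_count), the helper name summarize_meta, and the sample record are assumptions made for illustration.

```python
import json
from typing import Any, Dict


def summarize_meta(raw_json: str) -> str:
    """Summarize a metadata record in one line.

    Field names follow the documented list; the value types (a string
    title, an integer word_count) are assumptions.
    """
    record: Dict[str, Any] = json.loads(raw_json)
    title = record.get("title", "(untitled)")
    words = record.get("word_count", 0)
    return f"{title}: {words} words"


# Hypothetical sample record, not real yomi output.
sample = '{"title": "Example page", "word_count": 1200}'
print(summarize_meta(sample))  # prints "Example page: 1200 words"
```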

yomi links

yomi links <url> [flags]

Lists the outbound links found in the page's article body, one URL per line.

| Flag | Default | Meaning |
| --- | --- | --- |
| --json | false | Emit the links as JSON instead of one URL per line |

Plus the shared read flags.
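
The one-URL-per-line output is easy to post-process. As a minimal Python sketch, the hypothetical helper external_links below keeps only the links that point away from a given host. It works on captured text and makes no assumption about the shape of the --json output.

```python
from typing import List
from urllib.parse import urlparse


def external_links(output: str, host: str) -> List[str]:
    """Keep only the links that point away from `host`.

    `output` is text with one URL per line, the default output format.
    """
    urls = [line.strip() for line in output.splitlines() if line.strip()]
    return [u for u in urls if urlparse(u).hostname != host]
```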

yomi serve

yomi serve [dir] [flags]

Runs a local static file server over a folder of Markdown. With no dir, serves the current directory.

| Flag | Default | Meaning |
| --- | --- | --- |
| -a, --addr | 127.0.0.1:8800 | Address to listen on |

Shared read flags

These apply to read, site, meta, and links.

Fetching and rendering

| Flag | Default | Meaning |
| --- | --- | --- |
| --render | auto | auto static-fetches first and renders in headless Chrome only when the page looks JavaScript-gated; on always renders; off never launches a browser |
| --scroll | false | Auto-scroll the page in render mode to trigger lazy loading |
| --timeout | 30s | Per-request timeout |
| --user-agent | | User-Agent for fetches |
| --chrome | | Path to the Chrome/Chromium binary |
| --control-url | | Attach to a running Chrome DevTools endpoint |

Extraction and output

| Flag | Default | Meaning |
| --- | --- | --- |
| --links | inline | Link style: inline ([text](url)) or reference (definitions at the bottom) |
| --no-front-matter | false | Omit the YAML front-matter header |
| --title-heading | false | Keep the title as an H1 at the top of the body |
| --wrap | 0 | Hard-wrap prose at column N (0 = no wrap) |
| -q, --quiet | false | Suppress progress output |
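
To show the difference between the two link styles, the Python sketch below rewrites inline links as reference links with the definitions collected at the bottom. It only illustrates the two Markdown forms: the numbered labels and the helper name to_reference_links are assumptions, and yomi may label its references differently.

```python
import re
from typing import Dict, List

# Inline link [text](url); the lookbehind skips image syntax ![alt](url).
INLINE_LINK = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)\s]+)\)")


def to_reference_links(markdown: str) -> str:
    """Rewrite inline links as numbered references with definitions at the bottom."""
    numbers: Dict[str, int] = {}

    def replace(match: re.Match) -> str:
        text, url = match.group(1), match.group(2)
        # Reuse the existing number when the same URL appears again.
        number = numbers.setdefault(url, len(numbers) + 1)
        return f"[{text}][{number}]"

    body = INLINE_LINK.sub(replace, markdown)
    definitions: List[str] = [f"[{n}]: {url}" for url, n in numbers.items()]
    if not definitions:
        return body
    return body.rstrip("\n") + "\n\n" + "\n".join(definitions) + "\n"
```

Reference style keeps long URLs out of the prose, which tends to matter most when the output is hard-wrapped.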

Images

| Flag | Default | Meaning |
| --- | --- | --- |
| --images | remote | remote leaves image URLs absolute; download fetches images next to the output and rewrites to relative paths; inline embeds images as base64 data URIs |
| --max-image-mb | 16 | Skip images larger than this; leave them at their remote URL |
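
For readers unfamiliar with data URIs, the Python sketch below shows what inline embedding with a size cap amounts to. It is not yomi's code: the helper name to_data_uri is hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
import base64
from typing import Optional

MAX_IMAGE_MB = 16  # mirrors the documented --max-image-mb default


def to_data_uri(data: bytes, mime: str, max_mb: float = MAX_IMAGE_MB) -> Optional[str]:
    """Return a base64 data URI, or None when the image exceeds the cap.

    Assumption: 1 MB is taken as 1024 * 1024 bytes here.
    """
    if len(data) > max_mb * 1024 * 1024:
        return None  # the caller keeps the remote URL instead
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Base64 grows the payload by roughly a third, so inline embedding suits a few small images better than a page full of photographs.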

Front-matter fields

read and site write a YAML front-matter header on each Markdown file. The fields appear in this fixed order, and only non-empty ones are written:

| Field | Meaning |
| --- | --- |
| title | Page title |
| url | Source URL |
| site | Site name |
| byline | Author |
| published | Published date |
| fetched | When yomi read the page |
| lang | Content language |
| word_count | Words in the extracted article |
| reading_time | Estimated reading time |
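
Downstream tools often need the header and the body separately. The Python sketch below splits them using only the standard library. It assumes the header is delimited by --- lines, as is conventional for YAML front matter, and holds flat key: value pairs. The helper name split_front_matter is hypothetical, and a real YAML parser is the safer choice for anything beyond simple scalars.

```python
from typing import Dict, Tuple


def split_front_matter(text: str) -> Tuple[Dict[str, str], str]:
    """Split a Markdown document into (front-matter fields, body).

    Assumes the header is delimited by `---` lines and holds flat
    `key: value` pairs; values are returned as raw strings.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no header, e.g. written with --no-front-matter
    fields: Dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[index + 1:])
            return fields, body
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    return {}, text  # unterminated header: treat the whole text as body
```

Python dictionaries preserve insertion order, so the returned fields keep the order in which they appear in the file.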