Reading a page

Turn one URL into clean Markdown with yomi read: render modes, front-matter, the title heading, images, and the meta and links subcommands.

yomi read is the core command: one URL in, clean Markdown out. By default it prints to stdout, so it pipes and redirects like any Unix tool.

yomi read paulgraham.com/greatwork.html
yomi read paulgraham.com/greatwork.html -o greatwork.md

-o/--out writes to a file instead of stdout.

Render modes

The default mode is --render auto. yomi fetches the page with a plain HTTP request first and only escalates to headless Chrome when the page looks JavaScript-gated: an empty single-page-app mount like #root, #__next, or #app, a <noscript> block saying JavaScript is required, or under 25 words of visible text. A page that already arrived as readable HTML is never sent to the browser.

# Default: static fetch, render only when the page needs it
yomi read example.com

# Always render in headless Chrome
yomi read example.com --render on

# Never launch a browser; read whatever the static HTML gives
yomi read example.com --render off

Use --render on for a site you already know is a single-page app. Use --render off when you know the page is static and want to stay fast, or when no browser is available.
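The escalation check that --render auto performs can be pictured with a short Python sketch. This is not yomi's implementation: the function names, the regex-based parsing, and the rule that a noscript block "says JavaScript is required" whenever it mentions the word are all assumptions made for illustration. Only the three signals and the 25-word threshold come from the description above.

```python
import re

# Mount ids and word threshold taken from the prose above.
MOUNT_IDS = ("root", "__next", "app")
MIN_WORDS = 25

# An element with one of the mount ids and nothing but whitespace inside.
_EMPTY_MOUNT = re.compile(
    r"<(\w+)[^>]*\sid\s*=\s*[\"'](?:%s)[\"'][^>]*>\s*</\1\s*>" % "|".join(MOUNT_IDS),
    re.IGNORECASE,
)
_NOSCRIPT = re.compile(r"<noscript\b[^>]*>(.*?)</noscript\s*>", re.I | re.S)
_HIDDEN = re.compile(r"<(script|style|noscript)\b[^>]*>.*?</\1\s*>", re.I | re.S)
_TAG = re.compile(r"<[^>]+>")


def visible_word_count(html):
    """Rough count of words a reader would see in the static HTML."""
    text = _TAG.sub(" ", _HIDDEN.sub(" ", html))
    return len(text.split())


def looks_js_gated(html):
    """Return True when the static HTML suggests the page needs a browser."""
    if _EMPTY_MOUNT.search(html):
        return True  # empty single-page-app mount
    # Assumption: any mention of JavaScript inside <noscript> counts.
    if any("javascript" in block.lower() for block in _NOSCRIPT.findall(html)):
        return True
    return visible_word_count(html) < MIN_WORDS  # too little visible text
```

Under these assumptions, a bare `<div id="root"></div>` shell returns True, while an article with a few paragraphs of text returns False and would stay on the static path.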

If a page lazy-loads content as you scroll, add --scroll so the render path scrolls the page before snapshotting:

yomi read example.com --render on --scroll

Front-matter

Every Markdown file opens with a YAML front-matter block carrying the metadata yomi read from the page. The fields appear in a fixed order, and only non-empty ones are written:

---
title: How to Do Great Work
url: https://paulgraham.com/greatwork.html
site: Paul Graham
byline: Paul Graham
published: 2023-07-01
fetched: 2026-06-17T09:30:00Z
lang: en
word_count: 13500
reading_time: 54 min
---
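The "fixed order, only non-empty fields" rule can be sketched in a few lines of Python. The field order below is copied from the sample block above; the function name and the treatment of None and empty strings as "empty" are assumptions, and the sketch skips YAML quoting, which a real writer would need for values such as titles containing a colon followed by a space.

```python
# Field order as it appears in the sample front-matter above.
FIELD_ORDER = [
    "title", "url", "site", "byline", "published",
    "fetched", "lang", "word_count", "reading_time",
]


def front_matter(meta):
    """Render a front-matter block from a metadata dict (hypothetical helper)."""
    lines = ["---"]
    for key in FIELD_ORDER:
        value = meta.get(key)
        # Assumption: None and "" are the values that count as empty.
        if value is None or value == "":
            continue
        lines.append(f"{key}: {value}")  # no YAML quoting in this sketch
    lines.append("---")
    return "\n".join(lines) + "\n"
```

Passing the keys in any order yields the same output, because the order comes from FIELD_ORDER rather than from the input dict.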

To omit the header entirely and emit only the body:

yomi read example.com --no-front-matter

The title as a heading

By default the title lives in the front-matter, not in the body. To keep it as an H1 at the top of the Markdown body as well:

yomi read example.com --title-heading

Wrapping prose

By default prose is left unwrapped, one paragraph per line. To hard-wrap at a column:

yomi read example.com --wrap 80

--wrap 0 (the default) means no wrapping.
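As a rough illustration of what hard-wrapping at a column means, here is a Python sketch built on the standard textwrap module. It is an assumption about the general technique, not yomi's code: the function name is hypothetical, and a real converter would also have to leave code blocks, tables, and headings alone.

```python
import textwrap


def wrap_prose(markdown, width=0):
    """Hard-wrap each paragraph at `width` columns; 0 leaves text unwrapped."""
    if width <= 0:
        return markdown
    paragraphs = markdown.split("\n\n")
    wrapped = [
        # Keep long words and hyphenated terms whole so URLs are not split.
        textwrap.fill(p, width=width, break_long_words=False, break_on_hyphens=False)
        for p in paragraphs
    ]
    return "\n\n".join(wrapped)
```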

Links are written inline by default ([text](url)). To collect them as reference definitions at the bottom of the document instead:

yomi read example.com --links reference
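The difference between the two link styles can be shown with a small Python sketch that rewrites inline links into reference definitions. The numeric labels, the reuse of one label per URL, and the function name are assumptions for illustration; the page does not say how yomi labels its references.

```python
import re

# Inline links, skipping images (which start with "!").
_INLINE_LINK = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)\s]+)\)")


def to_reference_links(markdown):
    """Move inline link targets into numbered definitions at the bottom."""
    numbers = {}  # url -> label, in order of first appearance

    def replace(match):
        text, url = match.group(1), match.group(2)
        label = numbers.setdefault(url, len(numbers) + 1)
        return f"[{text}][{label}]"

    body = _INLINE_LINK.sub(replace, markdown)
    if not numbers:
        return body
    definitions = "\n".join(f"[{label}]: {url}" for url, label in numbers.items())
    return body.rstrip("\n") + "\n\n" + definitions + "\n"
```

Reference style keeps long URLs out of the running text, which tends to matter most when the output is also hard-wrapped.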

Images

By default image URLs are left absolute, pointing at the live web (--images remote). Two other policies are available: --images download saves each image next to the output and rewrites the reference to a relative path, and --images inline embeds images as base64 data URIs for a self-contained file:

# Download images into a sidecar folder
yomi read example.com -o page.md --images download

# Embed images inline as data URIs
yomi read example.com -o page.md --images inline

For a single read, --images download writes images into a <name>.media/ sidecar folder next to the output file. The images guide covers all three policies and the size cap.
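To make the inline policy concrete, the following Python sketch shows how an image becomes a base64 data URI inside a Markdown image reference. The helper names are hypothetical and this is only the general data-URI technique, not yomi's code.

```python
import base64


def to_data_uri(data, mime_type):
    """Encode raw image bytes as a base64 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_image(alt, data, mime_type):
    """Build a Markdown image whose target is the embedded data itself."""
    return f"![{alt}]({to_data_uri(data, mime_type)})"
```

Base64 encoding grows the data by roughly one third, which is one reason a size cap is useful when embedding images.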

Just the metadata

yomi meta prints the page's metadata record as JSON and skips the Markdown body entirely:

yomi meta paulgraham.com/greatwork.html
{
  "title": "How to Do Great Work",
  "byline": "Paul Graham",
  "site": "Paul Graham",
  "language": "en",
  "word_count": 13500,
  "reading_time": "54 min",
  "links": 42,
  "images": 3
}

This is handy for scripting: feed a list of URLs through yomi meta and you get a structured row per page without converting any prose.
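One way to script this from Python is sketched below. It assumes yomi is on the PATH and that yomi meta writes only the JSON record to stdout, as in the sample above; the function names and the choice of columns are illustrative. The run parameter exists so the command runner can be swapped out when trying the code without yomi installed.

```python
import json
import subprocess


def yomi_meta(url, run=subprocess.run):
    """Run `yomi meta <url>` and return the parsed JSON record."""
    proc = run(["yomi", "meta", url], capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)


def meta_rows(urls, run=subprocess.run):
    """Yield one small dict per URL, ready to write out as CSV or JSON lines."""
    for url in urls:
        record = yomi_meta(url, run=run)
        yield {
            "url": url,
            "title": record.get("title"),
            "word_count": record.get("word_count"),
        }
```

Because check=True is set, a failing yomi invocation raises subprocess.CalledProcessError instead of producing a half-empty row.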

yomi links lists the outbound links found in the page's article body, one URL per line:

yomi links paulgraham.com/greatwork.html

Add --json for a structured list. Because the links come from the extracted article body and not the whole page, you get the links the author wrote, not the nav and footer links around them.

All the shared read flags (--render, --scroll, --timeout, --user-agent, --chrome, and the rest) apply to meta and links too, since both have to fetch and extract the page before they can report on it.