# AGENTS.md — Tech Daily

> Agent integration guide for Tech Daily. This is **not** a developer doc for the underlying coil platform — it's a runtime guide for AI agents (browse-on-behalf, search/answer engines, native MCP clients) helping a human listener consume this podcast.

## Cheapest path per capability

| Listener intent | Endpoint |
|---|---|
| "What's the latest episode?" | `GET https://tech-daily-cao.pages.dev/?mode=agent` (returns `latestEpisode`) |
| "Find the episode about `<X>`" | `GET https://tech-daily-cao.pages.dev/api/search?q=<X>` (URL-encode `<X>`) |
| "Ask the show a question" | `POST https://tech-daily-cao.pages.dev/ask` (NLWeb; SSE via `Accept: text/event-stream`) |
| "Subscribe me" | RSS: https://tech-daily-cao.pages.dev/rss.xml |
| "Read the transcript of episode N" | `GET https://tech-daily-cao.pages.dev/<N>.md` (markdown) or `GET https://tech-daily-cao.pages.dev/sNNeMM.txt` (raw) |
| "Browse the catalog" | `GET https://tech-daily-cao.pages.dev/episodes.json` or `GET https://tech-daily-cao.pages.dev/episodes/llms.txt` |
| Health check / circuit-breaker | `GET https://tech-daily-cao.pages.dev/status` |
| Native MCP tool calls | `POST https://tech-daily-cao.pages.dev/mcp` (Streamable HTTP, JSON-RPC 2.0) |
| MCP server preview before connect | `GET https://tech-daily-cao.pages.dev/.well-known/mcp/server-card.json` |
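For the MCP row, a JSON-RPC 2.0 call over Streamable HTTP could be assembled roughly as in this sketch. The `tools/list` method and the dual `Accept` header follow the MCP specification. The helper name `build_mcp_request` and the default request id are illustrative, and this guide does not say whether the server expects an `initialize` handshake first, so check the server card before relying on it.

```python
import json
import urllib.request

MCP_URL = "https://tech-daily-cao.pages.dev/mcp"


def build_mcp_request(method, params=None, request_id=1):
    """Build (but do not send) a JSON-RPC 2.0 POST for the /mcp endpoint."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        payload["params"] = params
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients accept either a JSON body or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


request = build_mcp_request("tools/list")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it; building and sending are kept separate here so the payload can be inspected offline.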

## Rate limits

- **60 req/min/IP** across all API endpoints. Self-throttle: slow down as `X-RateLimit-Remaining` approaches zero, and honor `Retry-After` when it is present.
- All API responses include `X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Reset` (Unix seconds).
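One way an agent might turn these headers into a pause is sketched below. The function name `seconds_to_wait` is hypothetical, and the code assumes `Retry-After` carries delta-seconds rather than an HTTP-date, which this guide does not specify.

```python
import time


def seconds_to_wait(status, headers, now=None):
    """Return how long to pause before the next request, in seconds."""
    now = time.time() if now is None else now
    # HTTP header names are case-insensitive, so normalise them first.
    h = {k.lower(): v for k, v in headers.items()}

    if status == 429 and "retry-after" in h:
        # Assumption: Retry-After is delta-seconds, not an HTTP-date.
        return max(0.0, float(h["retry-after"]))

    remaining = int(h.get("x-ratelimit-remaining", 1))
    if remaining <= 0 and "x-ratelimit-reset" in h:
        # X-RateLimit-Reset is documented above as Unix seconds.
        return max(0.0, float(h["x-ratelimit-reset"]) - now)
    return 0.0
```

A caller would typically `time.sleep()` for the returned value before issuing the next request.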

## Errors

Errors use a structured JSON envelope: `{ error: { code, message, hint, docs_url } }`.
Status codes used: **400** (bad query/body), **404** (no such episode), **405** (wrong method), **429** (rate-limited), **500** (server error).
A missing episode requested via `?mode=agent` or `Accept: application/json` returns a real 404 with the JSON envelope; browsers still get a 301 redirect to the homepage.
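A client could unpack the envelope along these lines. This is a minimal sketch: the `TechDailyError` class, the `retryable` rule, and the fallback values for a missing envelope are assumptions made for illustration, not part of the API.

```python
import json


class TechDailyError(Exception):
    """Hypothetical exception type carrying the documented envelope fields."""

    def __init__(self, status, code, message, hint=None, docs_url=None):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.hint = hint
        self.docs_url = docs_url

    @property
    def retryable(self):
        # Assumption: 429 and 500 may succeed later; 400/404/405 need a new request.
        return self.status in (429, 500)


def parse_error(status, body):
    """Turn a non-2xx response body into a TechDailyError."""
    try:
        err = json.loads(body).get("error", {})
    except (ValueError, AttributeError):
        err = {}  # Body was not the JSON envelope (e.g. an HTML error page).
    if not isinstance(err, dict):
        err = {}
    return TechDailyError(
        status,
        err.get("code", "unknown"),
        err.get("message", "no error envelope in response"),
        err.get("hint"),
        err.get("docs_url"),
    )
```

Surfacing `hint` and `docs_url` to the calling agent keeps the server's own recovery advice intact instead of replacing it with a generic message.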

## Discovery surfaces

- **llms.txt:** [/llms.txt](https://tech-daily-cao.pages.dev/llms.txt), [/episodes/llms.txt](https://tech-daily-cao.pages.dev/episodes/llms.txt), [/api/llms.txt](https://tech-daily-cao.pages.dev/api/llms.txt), [/.well-known/llms.txt](https://tech-daily-cao.pages.dev/.well-known/llms.txt)
- **agent.json:** [/.well-known/agent.json](https://tech-daily-cao.pages.dev/.well-known/agent.json) — capability declaration + endpoint inventory
- **agent-card.json:** [/.well-known/agent-card.json](https://tech-daily-cao.pages.dev/.well-known/agent-card.json) — A2A-style skill card
- **agent-skills:** [/.well-known/agent-skills/index.json](https://tech-daily-cao.pages.dev/.well-known/agent-skills/index.json) — agentskills.io v0.2.0 (SKILL.md artifacts with sha256)
- **MCP discovery (all return the same manifest):** [/.well-known/mcp](https://tech-daily-cao.pages.dev/.well-known/mcp), [/.well-known/mcp.json](https://tech-daily-cao.pages.dev/.well-known/mcp.json), [/.well-known/mcp-configuration](https://tech-daily-cao.pages.dev/.well-known/mcp-configuration), [/.well-known/mcp/server.json](https://tech-daily-cao.pages.dev/.well-known/mcp/server.json)
- **OpenAPI 3.1:** [/.well-known/openapi.json](https://tech-daily-cao.pages.dev/.well-known/openapi.json)
- **Schema map (NLWeb):** [/.well-known/schema-map.xml](https://tech-daily-cao.pages.dev/.well-known/schema-map.xml)
- **Sitemap:** [/sitemap.xml](https://tech-daily-cao.pages.dev/sitemap.xml)
- **HTTP Link headers (RFC 8288):** every HTML response advertises sitemap, markdown alternates, OpenAPI, agent.json, agent-card, agent-skills, MCP, RSS, and llms.txt.
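To use the `Link` header for discovery, an agent needs to split it into targets and parameters. The sketch below is a simplified parser that handles the common `<target>; rel="..."; type="..."` shape but not every RFC 8288 edge case (for example, commas inside quoted values). The header value in the test and the `rel` names in it are invented for illustration; the live site's values may differ.

```python
import re

# One link-value: <target> followed by zero or more ;param=value pairs.
_LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
_PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*"?([^";,]*)"?')


def parse_link_header(value):
    """Parse a Link header value into a list of dicts (simplified)."""
    links = []
    for target, raw_params in _LINK_RE.findall(value):
        params = {k.lower(): v.strip() for k, v in _PARAM_RE.findall(raw_params)}
        links.append({"target": target, **params})
    return links


def find_rel(links, rel):
    """Return targets whose rel attribute contains the given relation type."""
    return [l["target"] for l in links if rel in l.get("rel", "").split()]
```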

## Modes & negotiation

- `?mode=agent` on `/` or `/<id>` → compact JSON envelope
- `/<id>.md` or `Accept: text/markdown` → markdown view of episode (or homepage)
- `Accept: application/json` is **not** required — the JSON forms are URL-addressable
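Because both forms are URL-addressable, choosing a representation reduces to string building. The helper below is a sketch under the rules listed above; `page_url` is a hypothetical name, and the episode id `42` used in the usage lines is made up.

```python
from urllib.parse import quote

BASE = "https://tech-daily-cao.pages.dev"


def page_url(episode_id=None, form="json"):
    """URL for the homepage (episode_id=None) or one episode, in the given form."""
    if form not in ("json", "markdown"):
        raise ValueError("form must be 'json' or 'markdown'")
    if episode_id is None:
        return f"{BASE}/?mode=agent" if form == "json" else f"{BASE}/index.md"
    slug = quote(str(episode_id), safe="")  # Keep odd ids from breaking the path.
    suffix = "?mode=agent" if form == "json" else ".md"
    return f"{BASE}/{slug}{suffix}"


print(page_url())                  # homepage, compact JSON
print(page_url(42, "markdown"))    # episode 42 (hypothetical id), markdown
```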

## Crawl policy

Runtime browse-on-behalf bots (ChatGPT-User, OAI-SearchBot, PerplexityBot, Claude-User, Applebot, etc.) are **always allowed**, regardless of the show's training-opt-in setting. Training crawlers are gated on `ai_training` in the show config — see `/robots.txt` for the live policy.
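A crawler that wants to confirm the live policy can feed `/robots.txt` to Python's standard `urllib.robotparser`. In this sketch the `SAMPLE` policy is invented to show the runtime-versus-training split described above; it is not a copy of the site's actual file, and `allowed_agents` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

SITE = "https://tech-daily-cao.pages.dev"


def allowed_agents(robots_txt, user_agents, path="/"):
    """Map each user-agent token to whether robots_txt lets it fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {ua: parser.can_fetch(ua, SITE + path) for ua in user_agents}


# Illustrative policy only; fetch /robots.txt for the real one.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(allowed_agents(SAMPLE, ["ChatGPT-User", "GPTBot"]))
```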

## Identity

- Host: Michael Lugassy
- Language: en
- Cadence: daily
- Site: https://tech-daily-cao.pages.dev

## Things not to do

- Don't scrape rendered HTML when a structured endpoint exists. Every piece of metadata is one fetch away in JSON or markdown.
- Don't fetch the SPA bundle to extract content; `/index.md` and `/<id>.md` are both faster and more stable.
- Don't request more than `limit=50` from `/api/search`; that's the hard cap.
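The search cap is easy to enforce on the client side. The sketch below builds a `/api/search` URL and clamps `limit` to 50. The `q` and `limit` parameter names come from this guide; the `search_url` helper and the lower bound of 1 are assumptions.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://tech-daily-cao.pages.dev/api/search"
MAX_LIMIT = 50  # Hard cap stated in this guide.


def search_url(query, limit=None):
    """Build a /api/search URL, clamping limit to the documented cap."""
    params = {"q": query}
    if limit is not None:
        # Assumption: values below 1 are not meaningful, so clamp to [1, 50].
        params["limit"] = max(1, min(int(limit), MAX_LIMIT))
    return f"{SEARCH_URL}?{urlencode(params)}"


print(search_url("AI chips"))
print(search_url("rust", limit=200))
```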
