AutoGPT/docs/integrations/exa/contents.md
Nicholas Tindle c1a1767034 feat(docs): Add block documentation auto-generation system (#11707)
2026-01-19 07:03:19 +00:00


Exa Contents

Blocks for retrieving and extracting content from web pages using Exa's contents API.

Exa Contents

What it is

Retrieves document contents using Exa's contents API.

How it works

This block retrieves full content from web pages using Exa's contents API. You can provide URLs directly or document IDs from previous searches. The API supports live crawling to fetch fresh content and can extract text, highlights, and AI-generated summaries.

The block supports subpage crawling to gather related content and offers various content retrieval options including full text extraction, relevant highlights, and customizable summary generation. Results are formatted for easy use with LLMs.
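
To make the option handling above concrete, here is a minimal sketch of how the block's inputs might be assembled into a request body. The helper `build_contents_payload` is hypothetical and is not the block's actual implementation; the camelCase key names (`livecrawlTimeout`, `subpageTarget`) are an assumption about the wire format. The sketch makes no network call.

```python
from typing import Any, Dict, List, Optional, Union

# Livecrawl modes as listed in the block's inputs.
LIVECRAWL_MODES = {"never", "fallback", "always", "preferred"}


def build_contents_payload(
    urls: Optional[List[str]] = None,
    ids: Optional[List[str]] = None,
    text: Optional[bool] = None,
    livecrawl: str = "fallback",
    livecrawl_timeout: Optional[int] = None,
    subpages: Optional[int] = None,
    subpage_target: Union[str, List[str], None] = None,
) -> Dict[str, Any]:
    """Assemble a request body from the block's inputs (hypothetical helper)."""
    if livecrawl not in LIVECRAWL_MODES:
        raise ValueError(f"unknown livecrawl mode: {livecrawl!r}")
    if not urls and not ids:
        raise ValueError("provide at least one URL or document ID")

    payload: Dict[str, Any] = {"livecrawl": livecrawl}

    # 'urls' is preferred; the deprecated 'ids' is used only as a fallback.
    if urls:
        payload["urls"] = list(urls)
    else:
        payload["ids"] = list(ids or [])

    # Optional settings are included only when the caller set them.
    # The camelCase key names below are assumed, not confirmed by this page.
    if text is not None:
        payload["text"] = text
    if livecrawl_timeout is not None:
        payload["livecrawlTimeout"] = livecrawl_timeout  # milliseconds
    if subpages is not None:
        payload["subpages"] = subpages
    if subpage_target is not None:
        payload["subpageTarget"] = subpage_target
    return payload
```

Calling `build_contents_payload(urls=["https://example.com"], text=True, livecrawl="always")` would yield a dictionary containing only the keys that were explicitly set, plus the livecrawl mode.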

Inputs

| Input | Description | Type | Required |
|-------|-------------|------|----------|
| urls | Array of URLs to crawl (preferred over 'ids') | List[str] | No |
| ids | [DEPRECATED - use 'urls' instead] Array of document IDs obtained from searches | List[str] | No |
| text | Retrieve text content from pages | bool | No |
| highlights | Most relevant text snippets from each page | HighlightSettings | No |
| summary | LLM-generated summary of the webpage | SummarySettings | No |
| livecrawl | Livecrawling options: never, fallback (default), always, preferred | "never" \| "fallback" \| "always" \| "preferred" | No |
| livecrawl_timeout | Timeout for livecrawling in milliseconds | int | No |
| subpages | Number of subpages to crawl | int | No |
| subpage_target | Keyword(s) to find specific subpages of search results | str \| List[str] | No |
| extras | Extra parameters for additional content | ExtrasSettings | No |

Outputs

| Output | Description | Type |
|--------|-------------|------|
| error | Error message if the request failed | str |
| results | List of document contents with metadata | List[ExaSearchResults] |
| result | Single document content result | ExaSearchResults |
| context | A formatted string of the results ready for LLMs | str |
| request_id | Unique identifier for the request | str |
| statuses | Status information for each requested URL | List[ContentStatus] |
| cost_dollars | Cost breakdown for the request | CostDollars |

Possible use case

Content Aggregation: Retrieve full article content from multiple URLs for analysis or summarization.

Competitive Research: Crawl competitor websites to extract product information, pricing, or feature details.

Data Enrichment: Fetch detailed content from URLs discovered through Exa searches to build comprehensive datasets.