- Add generate_block_docs.py script that introspects block code to generate markdown
- Support manual content preservation via <!-- MANUAL: --> markers
- Add migrate_block_docs.py to preserve existing manual content from git HEAD
- Add CI workflow (docs-block-sync.yml) to fail if docs drift from code
- Add Claude PR review workflow (docs-claude-review.yml) for doc changes
- Add manual LLM enhancement workflow (docs-enhance.yml)
- Add GitBook configuration (.gitbook.yaml, SUMMARY.md)
- Fix non-deterministic category ordering (categories is a set)
- Add comprehensive test suite (32 tests)
- Generate docs for 444 blocks with 66 preserved manual sections
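
For reference, the preservation mechanism works roughly like this: hand-written text wrapped in the markers survives regeneration while everything outside them is rewritten from code. The exact closing-tag shape shown here is an assumption based on the `<!-- MANUAL: -->` bullet above, not a confirmed syntax:

```markdown
<!-- MANUAL: use-cases -->
Hand-written notes placed here are carried over
when generate_block_docs.py rewrites this page.
<!-- END MANUAL -->
```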
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
<!-- Clearly explain the need for these changes: -->
### Changes 🏗️
<!-- Concisely describe all of the changes made in this pull request: -->
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] Extensively test code generation for the docs pages
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Introduces an automated documentation pipeline for blocks and integrates it into CI.
>
> - Adds `scripts/generate_block_docs.py` (+ tests) to introspect blocks and generate `docs/integrations/**`, preserving `<!-- MANUAL: -->` sections
> - New CI workflows: **docs-block-sync** (fails if docs drift), **docs-claude-review** (AI review for block/docs PRs), and **docs-enhance** (optional LLM improvements)
> - Updates existing Claude workflows to use `CLAUDE_CODE_OAUTH_TOKEN` instead of `ANTHROPIC_API_KEY`
> - Fixes numerous typos and improves block descriptions and links across backend blocks to standardize docs output
> - Commits initial generated docs, including `docs/integrations/README.md` and many provider/category pages
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 631e53e0f6. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
# Exa Contents

Blocks for retrieving and extracting content from web pages using Exa's contents API.

## Exa Contents

### What it is

Retrieves document contents using Exa's contents API.
### How it works
This block retrieves full content from web pages using Exa's contents API. You can provide URLs directly or document IDs from previous searches. The API supports live crawling to fetch fresh content and can extract text, highlights, and AI-generated summaries.
The block supports subpage crawling to gather related content and offers various content retrieval options including full text extraction, relevant highlights, and customizable summary generation. Results are formatted for easy use with LLMs.
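
As a rough illustration of what the block does under the hood, here is a minimal sketch of building and sending a contents request. The endpoint URL, `x-api-key` header, and field names are assumptions based on Exa's public REST API and the block inputs described below, not the block's actual implementation:

```python
import json
import os
import urllib.request

API_URL = "https://api.exa.ai/contents"  # assumed Exa contents endpoint


def build_contents_payload(urls, text=True, livecrawl="fallback", subpages=0):
    """Build a request body mirroring the block's inputs (sketch only)."""
    payload = {"urls": urls, "text": text, "livecrawl": livecrawl}
    if subpages:
        payload["subpages"] = subpages
    return payload


def fetch_contents(payload, api_key):
    """POST the payload to Exa; needs a real API key and network access."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


payload = build_contents_payload(["https://example.com"], livecrawl="always")
print(json.dumps(payload, indent=2))
```

Only the payload is constructed here; `fetch_contents` is left uncalled so the sketch runs without credentials.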
### Inputs
| Input | Description | Type | Required |
|---|---|---|---|
| urls | Array of URLs to crawl (preferred over 'ids') | List[str] | No |
| ids | [DEPRECATED - use 'urls' instead] Array of document IDs obtained from searches | List[str] | No |
| text | Retrieve text content from pages | bool | No |
| highlights | The most relevant text snippets from each page | HighlightSettings | No |
| summary | LLM-generated summary of the webpage | SummarySettings | No |
| livecrawl | Livecrawling options: never, fallback (default), always, preferred | "never" \| "fallback" \| "always" \| "preferred" | No |
| livecrawl_timeout | Timeout for livecrawling in milliseconds | int | No |
| subpages | Number of subpages to crawl | int | No |
| subpage_target | Keyword(s) to find specific subpages of search results | str \| List[str] | No |
| extras | Extra parameters for additional content | ExtrasSettings | No |
### Outputs
| Output | Description | Type |
|---|---|---|
| error | Error message if the request failed | str |
| results | List of document contents with metadata | List[ExaSearchResults] |
| result | Single document content result | ExaSearchResults |
| context | A formatted string of the results ready for LLMs | str |
| request_id | Unique identifier for the request | str |
| statuses | Status information for each requested URL | List[ContentStatus] |
| cost_dollars | Cost breakdown for the request | CostDollars |
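
The `context` output is the piece most commonly wired straight into an LLM prompt. A small sketch of consuming the outputs follows; the `response` dict is invented sample data shaped like the table above, not a real API response:

```python
# Invented sample of the block's outputs, for illustration only.
response = {
    "results": [
        {"url": "https://example.com", "title": "Example", "text": "Hello world."},
    ],
    "context": "Example (https://example.com): Hello world.",
    "request_id": "abc123",
}


def summarize_outputs(outputs):
    """Pull the LLM-ready context string and per-result titles."""
    titles = [r.get("title", "") for r in outputs.get("results", [])]
    return outputs.get("context", ""), titles


context, titles = summarize_outputs(response)
print(titles)
```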
### Possible use cases

- **Content Aggregation:** Retrieve full article content from multiple URLs for analysis or summarization.
- **Competitive Research:** Crawl competitor websites to extract product information, pricing, or feature details.
- **Data Enrichment:** Fetch detailed content from URLs discovered through Exa searches to build comprehensive datasets.