mirror of
https://github.com/Significant-Gravitas/AutoGPT.git
synced 2026-04-08 03:00:28 -04:00
- Add generate_block_docs.py script that introspects block code to
generate markdown
- Support manual content preservation via <!-- MANUAL: --> markers
- Add migrate_block_docs.py to preserve existing manual content from git
HEAD
- Add CI workflow (docs-block-sync.yml) to fail if docs drift from code
- Add Claude PR review workflow (docs-claude-review.yml) for doc changes
- Add manual LLM enhancement workflow (docs-enhance.yml)
- Add GitBook configuration (.gitbook.yaml, SUMMARY.md)
- Fix non-deterministic category ordering (categories is a set)
- Add comprehensive test suite (32 tests)
- Generate docs for 444 blocks with 66 preserved manual sections
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
<!-- Clearly explain the need for these changes: -->
### Changes 🏗️
<!-- Concisely describe all of the changes made in this pull request:
-->
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
<!-- Put your test plan here: -->
- [x] Extensively test code generation for the docs pages
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Introduces an automated documentation pipeline for blocks and
integrates it into CI.
>
> - Adds `scripts/generate_block_docs.py` (+ tests) to introspect blocks
and generate `docs/integrations/**`, preserving `<!-- MANUAL: -->`
sections
> - New CI workflows: **docs-block-sync** (fails if docs drift),
**docs-claude-review** (AI review for block/docs PRs), and
**docs-enhance** (optional LLM improvements)
> - Updates existing Claude workflows to use `CLAUDE_CODE_OAUTH_TOKEN`
instead of `ANTHROPIC_API_KEY`
> - Improves numerous block descriptions/typos and links across backend
blocks to standardize docs output
> - Commits initial generated docs including
`docs/integrations/README.md` and many provider/category pages
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
631e53e0f6. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
1.4 KiB
1.4 KiB
Jina Chunking
Blocks for splitting text into semantic chunks using Jina AI.
Jina Chunking
What it is
Chunks texts using Jina AI's segmentation service
How it works
This block uses Jina AI's segmentation service to split texts into semantically meaningful chunks. Unlike simple splitting by character count, Jina's chunking preserves semantic coherence, making it ideal for RAG applications.
Configure maximum chunk length and optionally return token information for each chunk.
Inputs
| Input | Description | Type | Required |
|---|---|---|---|
| texts | List of texts to chunk | List[Any] | Yes |
| max_chunk_length | Maximum length of each chunk | int | No |
| return_tokens | Whether to return token information | bool | No |
Outputs
| Output | Description | Type |
|---|---|---|
| error | Error message if the operation failed | str |
| chunks | List of chunked texts | List[Any] |
| tokens | List of token information for each chunk | List[Any] |
Possible use case
RAG Preprocessing: Chunk documents for retrieval-augmented generation systems.
Embedding Preparation: Split long texts into optimal chunks for embedding generation.
Document Processing: Break down large documents for analysis or storage in vector databases.