mirror of
https://github.com/Significant-Gravitas/AutoGPT.git
synced 2026-02-10 06:45:28 -05:00
57ecc10535ed9900a8b79554dca1841b67154c8b
3 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
b08761816a |
feat(backend): add getting user profile, drafts, update send email to use multiple to, cc, bcc (#10482)
Need: The Gmail integration had several parsing issues that were causing data loss and workflow incompatibilities:

1. Email recipient parsing only captured the first recipient, losing CC/BCC and multiple TO recipients
2. Email body parsing was inconsistent between blocks, sometimes showing "This email does not contain a readable body" for valid emails
3. Type mismatches between blocks caused serialization issues when connecting them in workflows (lists being converted to string representations like "[\"email@example.com\"]")

# Changes 🏗️

1. Enhanced Email Model:
   - Added `cc` and `bcc` fields to capture all recipients
   - Changed the `to` field from string to list for consistency
   - Now captures all recipients instead of just the first one
2. Improved Email Parsing:
   - Updated GmailReadBlock and GmailGetThreadBlock to parse all recipients using `getaddresses()`
   - Unified email body parsing logic across blocks with robust multipart handling
   - Added support for HTML to plain text conversion
   - Fixed handling of emails with attachments as body content
3. Fixed Block Compatibility:
   - Updated GmailSendBlock and GmailCreateDraftBlock to accept lists for recipient fields
   - Added validation to ensure at least one recipient is provided
   - All blocks now consistently use lists for recipient fields, preventing serialization issues
4. Updated Test Data:
   - Modified all test inputs/outputs to use the new list format for recipients
   - Ensures tests reflect the new data structure

# Checklist 📋

For code changes:

- I have clearly listed my changes in the PR description
- I have made a test plan
- I have tested my changes according to the test plan:
  - Run existing Gmail block unit tests with `poetry run test`
  - Create a workflow that reads emails with multiple recipients and verify all TO, CC, BCC recipients are captured
  - Test email body parsing with plain text, HTML, and multipart emails
  - Connect GmailReadBlock → GmailSendBlock in a workflow and verify recipient data flows correctly
  - Connect GmailReplyBlock → GmailSendBlock and verify no serialization errors occur
  - Test sending emails with multiple recipients via GmailSendBlock
  - Test creating drafts with multiple recipients via GmailCreateDraftBlock
  - Verify backwards compatibility by testing with single recipient strings (should now require lists)
  - Create from scratch and execute an agent with at least 3 blocks
  - Import an agent from file upload, and confirm it executes correctly
  - Upload agent to marketplace
  - Import an agent from marketplace and confirm it executes correctly
  - Edit an agent from monitor, and confirm it executes correctly

# Breaking Change

Note: The `to` field in GmailSendBlock and GmailCreateDraftBlock now requires a list instead of accepting both string and list. Existing workflows using strings will need to be updated to use lists (e.g., `["email@example.com"]` instead of `"email@example.com"`).

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co> |
||
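The recipient handling described in commit b08761816a can be sketched as follows. This is a minimal illustration, not the code from the PR: it assumes the headers arrive as a plain dict with lowercase keys, and `parse_recipients` and `require_recipient` are hypothetical names. The only library call is the standard `email.utils.getaddresses`, which the commit message names.

```python
from email.utils import getaddresses


def parse_recipients(headers: dict) -> dict:
    """Return every address in the To, Cc, and Bcc headers as lists of strings.

    `headers` is assumed (for illustration) to map lowercase header names
    to raw header values such as "Alice <alice@example.com>, bob@example.com".
    """
    result = {}
    for field in ("to", "cc", "bcc"):
        raw = headers.get(field, "")
        # getaddresses takes a list of header values and returns (name, address) pairs.
        pairs = getaddresses([raw]) if raw else []
        # Drop entries without an address, which the parser can emit for bad input.
        result[field] = [addr for _name, addr in pairs if addr]
    return result


def require_recipient(to: list, cc: list, bcc: list) -> None:
    """Hypothetical validation step: refuse to send when no recipient is given."""
    if not (to or cc or bcc):
        raise ValueError("At least one recipient is required")
```

Because every field is always a list, the output of a read step can be passed to a send step without converting between strings and lists, which is the type mismatch the commit sets out to remove.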
|
|
423b22214a |
feat(blocks): Add Excel support to ReadSpreadsheetBlock and introduced FileReadBlock (#10393)
This PR adds Excel file support to CSV processing and enhances text file reading capabilities.

### Changes 🏗️

**ReadSpreadsheetBlock (formerly ReadCsvBlock):**

- Renamed `ReadCsvBlock` to `ReadSpreadsheetBlock` for better clarity
- Added Excel file support (.xlsx, .xls) with automatic conversion to CSV using pandas
- Renamed parameter `file_in` to `file_input` for consistency
- Excel files are automatically detected by extension and converted to CSV format
- Maintains all existing CSV processing functionality (delimiters, headers, etc.)
- Graceful error handling when pandas library is not available

**FileReadBlock:**

- Enhanced text file reading with advanced chunking capabilities
- Added parameters: `skip_size`, `skip_rows`, `row_limit`, `size_limit`, `delimiter`
- Supports both character-based and row-based processing
- Chunked output for large files based on size limits
- Proper file handling with UTF-8 and latin-1 encoding fallbacks
- Uses `store_media_file` for secure file processing (URLs, data URIs, local paths)
- Fixed test input to use data URI instead of non-existent file

**General Improvements:**

- Consistent parameter naming across blocks (`file_input`)
- Enhanced error handling and validation
- Comprehensive test coverage
- All existing functionality preserved

### Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Both ReadSpreadsheetBlock and FileReadBlock instantiate correctly
  - [x] ReadSpreadsheetBlock processes CSV data with existing functionality
  - [x] FileReadBlock reads text files with data URI input
  - [x] All block tests pass (457 passed, 83 skipped)
  - [x] No linting errors in modified files
  - [x] Excel support gracefully handles missing pandas dependency

#### For configuration changes:

- [ ] `.env.example` is updated or already compatible with my changes
- [ ] `docker-compose.yml` is updated or already compatible with my changes
- [ ] I have included a list of my configuration changes in the PR description (under **Changes**)

*Note: No configuration changes required for this PR.* |
||
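The text-reading behaviour listed for FileReadBlock in commit 423b22214a (encoding fallback, row selection, size-based chunking) might look roughly like the sketch below. The function names are hypothetical, and the exact meaning given here to `skip_rows`, `row_limit`, `size_limit`, and `delimiter` is an assumption drawn from their names; the commit message does not spell out their semantics.

```python
def decode_with_fallback(data: bytes) -> str:
    """Decode as UTF-8 first; fall back to latin-1, which accepts any byte value."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode("latin-1")


def select_rows(text: str, delimiter: str = "\n",
                skip_rows: int = 0, row_limit: int = 0) -> list:
    """Split text on `delimiter`, drop the first `skip_rows` rows, keep at most `row_limit`.

    A `row_limit` of 0 is treated as "no limit" (an assumption of this sketch).
    """
    rows = text.split(delimiter)[skip_rows:]
    return rows[:row_limit] if row_limit > 0 else rows


def chunk_text(text: str, size_limit: int) -> list:
    """Cut text into pieces of at most `size_limit` characters each."""
    if size_limit <= 0:
        return [text] if text else []
    return [text[i:i + size_limit] for i in range(0, len(text), size_limit)]
```

Trying UTF-8 before latin-1 matters because latin-1 never fails: if it were tried first, valid UTF-8 input would be silently decoded into the wrong characters.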
|
|
db1f034544 |
Fix Gmail body parsing for multipart messages (#9863) (#10071)
The `GmailReadBlock._get_email_body()` method was only inspecting the top-level payload and a single `text/plain` part, causing it to return the fallback string "This email does not contain a text body." for most Gmail messages. This occurred because Gmail messages are typically wrapped in `multipart/alternative` or other multipart containers, which the original implementation couldn't handle.

This critical issue made the Gmail integration unusable for reading email body content, as virtually every real Gmail message uses multipart MIME structures.

### Changes

#### Core Implementation:

- **Replaced simple `_get_email_body()` with recursive multipart parser** that can walk through nested MIME structures
- **Added `_walk_for_body()` method** for recursive traversal of email parts with depth limiting (max 10 levels)
- **Implemented safe base64 decoding** with automatic padding correction in `_decode_base64()`
- **Added attachment body support** via `_download_attachment_body()` for emails where body content is stored as attachments

#### Email Format Support:

- **HTML to text conversion** using `html2text` library for HTML-only emails
- **Multipart/alternative handling** with preference for `text/plain` over `text/html`
- **Nested multipart structure support** (e.g., `multipart/mixed` containing `multipart/alternative`)
- **Single-part email support** (maintains backward compatibility)

#### Dependencies & Testing:

- **Added `html2text = "^2024.2.26"`** to `pyproject.toml` for HTML conversion
- **Created comprehensive unit tests** in `test/blocks/test_gmail.py` covering all email types and edge cases
- **Added error handling and graceful fallbacks** for malformed data and missing dependencies

#### Security & Performance:

- **Recursion depth limiting** prevents infinite loops on malformed email structures
- **Exception handling** ensures graceful degradation when API calls fail
- **Efficient tree traversal** with early returns for better performance

### Checklist

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:

<details>
<summary>Test Plan</summary>

- **Single-part text/plain emails** - Verified correct extraction of plain text content
- **Multipart/alternative emails** - Tested preference for plain text over HTML when both available
- **HTML-only emails** - Confirmed HTML to text conversion works correctly
- **Nested multipart structures** - Tested deeply nested `multipart/mixed` containing `multipart/alternative`
- **Attachment-based body content** - Verified downloading and decoding of body stored as attachments
- **Base64 padding edge cases** - Tested malformed base64 data with missing padding
- **Recursion depth limits** - Confirmed protection against infinite recursion
- **Error handling scenarios** - Tested graceful fallbacks for API failures and missing dependencies
- **Backward compatibility** - Ensured existing functionality remains unchanged for edge cases
- **Integration testing** - Ran standalone verification script with 100% test pass rate

</details>

#### For configuration changes:

- [x] `.env.example` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my changes
- [x] I have included a list of my configuration changes in the PR description (under **Changes**)

<details>
<summary>Configuration Changes</summary>

- Added `html2text` dependency to `pyproject.toml` - no environment or infrastructure changes required
- No changes to ports, services, secrets, or databases
- Fully backward compatible with existing Gmail API configuration

</details>

---------

Co-authored-by: Toran Bruce Richards <toran.richards@gmail.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co> |
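The recursive body extraction described in commit db1f034544 can be approximated with the sketch below. It is an illustration under stated assumptions, not the PR's implementation: the function names are hypothetical stand-ins for `_decode_base64()` and `_walk_for_body()`, the payload is assumed to follow the Gmail API message-part shape (`mimeType`, `body.data`, `parts`), and two steps from the PR are left out: converting HTML with `html2text` and downloading attachment bodies.

```python
import base64
from typing import Optional

MAX_DEPTH = 10  # the commit message mentions a 10-level recursion limit
FALLBACK = "This email does not contain a text body."


def decode_base64(data: str) -> Optional[str]:
    """Decode base64url text, restoring any '=' padding that was stripped."""
    padded = data + "=" * (-len(data) % 4)
    try:
        return base64.urlsafe_b64decode(padded).decode("utf-8", errors="replace")
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None


def find_part(part: dict, mime_type: str, depth: int = 0) -> Optional[str]:
    """Depth-first search for the first part of `mime_type` that carries inline data."""
    if depth > MAX_DEPTH:
        return None  # stop on pathologically nested or malformed structures
    data = part.get("body", {}).get("data")
    if part.get("mimeType") == mime_type and data:
        return decode_base64(data)
    for child in part.get("parts", []):
        found = find_part(child, mime_type, depth + 1)
        if found is not None:
            return found  # early return once a body is found
    return None


def get_email_body(payload: dict) -> str:
    """Prefer text/plain over text/html; return the fallback string if neither exists."""
    for mime_type in ("text/plain", "text/html"):
        body = find_part(payload, mime_type)
        if body is not None:
            return body  # the PR converts HTML to text here; this sketch returns it raw
    return FALLBACK
```

Searching the whole tree once per MIME type keeps the preference rule simple: a `text/plain` part wins even when an HTML part appears earlier in the same `multipart/alternative` container.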