mirror of
https://github.com/Significant-Gravitas/AutoGPT.git
synced 2026-01-07 22:33:57 -05:00
feat(backend): Add language fallback for YouTube transcription block (#11057)
## Problem

The YouTube transcription block would fail when attempting to transcribe videos that only had transcripts available in non-English languages. Even when usable transcripts existed in other languages, the block would raise a `NoTranscriptFound` error because it only requested English transcripts.

**Example video that would fail:** https://www.youtube.com/watch?v=3AMl5d2NKpQ (only has Hungarian transcripts)

**Error message:**
```
Could not retrieve a transcript for the video https://www.youtube.com/watch?v=3AMl5d2NKpQ!
No transcripts were found for any of the requested language codes: ('en',)
For this video (3AMl5d2NKpQ) transcripts are available in the following languages:
(GENERATED)
 - hu ("Hungarian (auto-generated)")
```

## Solution

Implemented language fallback in the `TranscribeYoutubeVideoBlock.get_transcript()` method:

1. **First**, tries to fetch the English transcript (maintains backward compatibility)
2. **If English is unavailable**, lists all available transcripts and selects the first one using this priority:
   - Manually created transcripts (any language)
   - Auto-generated transcripts (any language)
3. **Only fails** if no transcripts exist at all

**Example behavior:**
```python
# Before: Video with only Hungarian transcript
get_transcript("3AMl5d2NKpQ")  # ❌ Raises NoTranscriptFound

# After: Video with only Hungarian transcript
get_transcript("3AMl5d2NKpQ")  # ✅ Returns Hungarian transcript
```

## Changes

- **Modified** `backend/blocks/youtube.py`: Added try/except logic to fall back to any available language when English is not found
- **Added** `test/blocks/test_youtube.py`: Test suite covering URL extraction, language fallback, transcript preferences, and error handling (7 tests)
- **Updated** `docs/content/platform/blocks/youtube.md`: Documented the language fallback behavior and transcript priority order

## Testing

- ✅ All 7 new unit tests pass
- ✅ Block integration test passes
- ✅ Full test suite: 621 passed, 0 failed (no regressions)
- ✅ Code formatting and linting pass

## Impact

This fix enables the YouTube transcription block to work with international content while maintaining full backward compatibility:

- ✅ Videos in any language can now be transcribed
- ✅ English is still preferred when available
- ✅ No breaking changes to existing functionality
- ✅ Graceful degradation to available languages

Fixes #10637
Fixes https://linear.app/autogpt/issue/OPEN-2626

> [!WARNING]
>
> <details>
> <summary>Firewall rules blocked me from connecting to one or more addresses (expand for details)</summary>
>
> #### I tried to connect to the following addresses, but was blocked by firewall rules:
>
> - `www.youtube.com`
>   - Triggering command: `/home/REDACTED/.cache/pypoetry/virtualenvs/autogpt-platform-backend-Ajv4iu2i-py3.11/bin/python3` (dns block)
>
> If you need me to access, download, or install something from one of these locations, you can either:
>
> - Configure [Actions setup steps](https://gh.io/copilot/actions-setup-steps) to set up my environment, which run before the firewall is enabled
> - Add the appropriate URLs or hosts to the custom allowlist in this repository's [Copilot coding agent settings](https://github.com/Significant-Gravitas/AutoGPT/settings/copilot/coding_agent) (admins only)
>
> </details>

<!-- START COPILOT CODING AGENT SUFFIX -->

<details>
<summary>Original prompt</summary>

> Issue Title: if theres only one lanague available for transcribe youtube return that langage not an error
> Issue Description: `Could not retrieve a transcript for the video https://www.youtube.com/watch?v=3AMl5d2NKpQ! This is most likely caused by: No transcripts were found for any of the requested language codes: ('en',) For this video (3AMl5d2NKpQ) transcripts are available in the following languages: (MANUALLY CREATED) None (GENERATED) - hu ("Hungarian (auto-generated)") (TRANSLATION LANGUAGES) None If you are sure that the described cause is not responsible for this error and that a transcript should be retrievable, please create an issue at https://github.com/jdepoix/youtube-transcript-api/issues. Please add which version of youtube_transcript_api you are using and provide the information needed to replicate the error. Also make sure that there are no open issues which already describe your problem!` you can use this video to test: https://www.youtube.com/watch?v=3AMl5d2NKpQ
> Fixes https://linear.app/autogpt/issue/OPEN-2626/if-theres-only-one-lanague-available-for-transcribe-youtube-return
>
> Comment by User :
> This thread is for an agent session with githubcopilotcodingagent.
>
> Comment by User :
> This thread is for an agent session with githubcopilotcodingagent.
>
> Comment by User :
> This comment thread is synced to a corresponding [GitHub issue](https://github.com/Significant-Gravitas/AutoGPT/issues/10637). All replies are displayed in both locations.

</details>

<!-- START COPILOT CODING AGENT TIPS -->
---

✨ Let Copilot coding agent [set things up for you](https://github.com/Significant-Gravitas/AutoGPT/issues/new?title=✨+Set+up+Copilot+instructions&body=Configure%20instructions%20for%20this%20repository%20as%20documented%20in%20%5BBest%20practices%20for%20Copilot%20coding%20agent%20in%20your%20repository%5D%28https://gh.io/copilot-coding-agent-tips%29%2E%0A%0A%3COnboard%20this%20repo%3E&assignees=copilot). Coding agent works faster and does higher quality work when set up for your repo.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ntindle <8845353+ntindle@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
@@ -1,6 +1,7 @@
 from urllib.parse import parse_qs, urlparse
 
 from youtube_transcript_api._api import YouTubeTranscriptApi
+from youtube_transcript_api._errors import NoTranscriptFound
 from youtube_transcript_api._transcripts import FetchedTranscript
 from youtube_transcript_api.formatters import TextFormatter
 
@@ -64,7 +65,29 @@ class TranscribeYoutubeVideoBlock(Block):
 
     @staticmethod
     def get_transcript(video_id: str) -> FetchedTranscript:
-        return YouTubeTranscriptApi().fetch(video_id=video_id)
+        """
+        Get transcript for a video, preferring English but falling back to any available language.
+
+        :param video_id: The YouTube video ID
+        :return: The fetched transcript
+        :raises: Any exception except NoTranscriptFound for requested languages
+        """
+        api = YouTubeTranscriptApi()
+        try:
+            # Try to get English transcript first (default behavior)
+            return api.fetch(video_id=video_id)
+        except NoTranscriptFound:
+            # If English is not available, get the first available transcript
+            transcript_list = api.list(video_id)
+            # Try manually created transcripts first, then generated ones
+            available_transcripts = list(
+                transcript_list._manually_created_transcripts.values()
+            ) + list(transcript_list._generated_transcripts.values())
+            if available_transcripts:
+                # Fetch the first available transcript
+                return available_transcripts[0].fetch()
+            # If no transcripts at all, re-raise the original error
+            raise
 
     @staticmethod
     def format_transcript(transcript: FetchedTranscript) -> str:
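The selection order used by the fallback can be sketched in isolation, without network access or the transcript library. This is only an illustrative sketch: `TranscriptInfo` and `pick_transcript` are hypothetical names that do not appear in the commit, and the sketch assumes that English is tried first (manual before auto-generated), followed by any manually created transcript, then any auto-generated one.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TranscriptInfo:
    """Hypothetical stand-in for one entry of a video's transcript listing."""

    language_code: str
    is_generated: bool


def pick_transcript(
    available: Sequence[TranscriptInfo], preferred: str = "en"
) -> Optional[TranscriptInfo]:
    """Return the transcript to fetch, or None when the video has none."""

    def rank(info: TranscriptInfo) -> tuple:
        # Lower tuples win: preferred language first, then manual before generated.
        return (info.language_code != preferred, info.is_generated)

    # min() returns the first entry among ties, so listing order is preserved.
    return min(available, key=rank, default=None)
```

With a listing that contains only an auto-generated Hungarian entry, `pick_transcript` returns that entry; with an empty listing it returns `None`, which corresponds to the branch where the commit re-raises `NoTranscriptFound`.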
140
autogpt_platform/backend/test/blocks/test_youtube.py
Normal file
@@ -0,0 +1,140 @@
+from unittest.mock import Mock, patch
+
+import pytest
+from youtube_transcript_api._errors import NoTranscriptFound
+from youtube_transcript_api._transcripts import FetchedTranscript, Transcript
+
+from backend.blocks.youtube import TranscribeYoutubeVideoBlock
+
+
+class TestTranscribeYoutubeVideoBlock:
+    """Test cases for TranscribeYoutubeVideoBlock language fallback functionality."""
+
+    def setup_method(self):
+        """Set up test fixtures."""
+        self.youtube_block = TranscribeYoutubeVideoBlock()
+
+    def test_extract_video_id_standard_url(self):
+        """Test extracting video ID from standard YouTube URL."""
+        url = "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
+        video_id = self.youtube_block.extract_video_id(url)
+        assert video_id == "dQw4w9WgXcQ"
+
+    def test_extract_video_id_short_url(self):
+        """Test extracting video ID from shortened youtu.be URL."""
+        url = "https://youtu.be/dQw4w9WgXcQ"
+        video_id = self.youtube_block.extract_video_id(url)
+        assert video_id == "dQw4w9WgXcQ"
+
+    def test_extract_video_id_embed_url(self):
+        """Test extracting video ID from embed URL."""
+        url = "https://www.youtube.com/embed/dQw4w9WgXcQ"
+        video_id = self.youtube_block.extract_video_id(url)
+        assert video_id == "dQw4w9WgXcQ"
+
+    @patch("backend.blocks.youtube.YouTubeTranscriptApi")
+    def test_get_transcript_english_available(self, mock_api_class):
+        """Test getting transcript when English is available."""
+        # Setup mock
+        mock_api = Mock()
+        mock_api_class.return_value = mock_api
+        mock_transcript = Mock(spec=FetchedTranscript)
+        mock_api.fetch.return_value = mock_transcript
+
+        # Execute
+        result = TranscribeYoutubeVideoBlock.get_transcript("test_video_id")
+
+        # Assert
+        assert result == mock_transcript
+        mock_api.fetch.assert_called_once_with(video_id="test_video_id")
+        mock_api.list.assert_not_called()
+
+    @patch("backend.blocks.youtube.YouTubeTranscriptApi")
+    def test_get_transcript_fallback_to_first_available(self, mock_api_class):
+        """Test fallback to first available language when English is not available."""
+        # Setup mock
+        mock_api = Mock()
+        mock_api_class.return_value = mock_api
+
+        # Create mock transcript list with Hungarian transcript
+        mock_transcript_list = Mock()
+        mock_transcript_hu = Mock(spec=Transcript)
+        mock_fetched_transcript = Mock(spec=FetchedTranscript)
+        mock_transcript_hu.fetch.return_value = mock_fetched_transcript
+
+        # Set up the transcript list to have manually created transcripts empty
+        # and generated transcripts with Hungarian
+        mock_transcript_list._manually_created_transcripts = {}
+        mock_transcript_list._generated_transcripts = {"hu": mock_transcript_hu}
+
+        # Mock API to raise NoTranscriptFound for English, then return list
+        mock_api.fetch.side_effect = NoTranscriptFound(
+            "test_video_id", ("en",), mock_transcript_list
+        )
+        mock_api.list.return_value = mock_transcript_list
+
+        # Execute
+        result = TranscribeYoutubeVideoBlock.get_transcript("test_video_id")
+
+        # Assert
+        assert result == mock_fetched_transcript
+        mock_api.fetch.assert_called_once_with(video_id="test_video_id")
+        mock_api.list.assert_called_once_with("test_video_id")
+        mock_transcript_hu.fetch.assert_called_once()
+
+    @patch("backend.blocks.youtube.YouTubeTranscriptApi")
+    def test_get_transcript_prefers_manually_created(self, mock_api_class):
+        """Test that manually created transcripts are preferred over generated ones."""
+        # Setup mock
+        mock_api = Mock()
+        mock_api_class.return_value = mock_api
+
+        # Create mock transcript list with both manual and generated transcripts
+        mock_transcript_list = Mock()
+        mock_transcript_manual = Mock(spec=Transcript)
+        mock_transcript_generated = Mock(spec=Transcript)
+        mock_fetched_manual = Mock(spec=FetchedTranscript)
+        mock_transcript_manual.fetch.return_value = mock_fetched_manual
+
+        # Set up the transcript list
+        mock_transcript_list._manually_created_transcripts = {
+            "es": mock_transcript_manual
+        }
+        mock_transcript_list._generated_transcripts = {"hu": mock_transcript_generated}
+
+        # Mock API to raise NoTranscriptFound for English
+        mock_api.fetch.side_effect = NoTranscriptFound(
+            "test_video_id", ("en",), mock_transcript_list
+        )
+        mock_api.list.return_value = mock_transcript_list
+
+        # Execute
+        result = TranscribeYoutubeVideoBlock.get_transcript("test_video_id")
+
+        # Assert - should use manually created transcript first
+        assert result == mock_fetched_manual
+        mock_transcript_manual.fetch.assert_called_once()
+        mock_transcript_generated.fetch.assert_not_called()
+
+    @patch("backend.blocks.youtube.YouTubeTranscriptApi")
+    def test_get_transcript_no_transcripts_available(self, mock_api_class):
+        """Test that exception is re-raised when no transcripts are available at all."""
+        # Setup mock
+        mock_api = Mock()
+        mock_api_class.return_value = mock_api
+
+        # Create mock transcript list with no transcripts
+        mock_transcript_list = Mock()
+        mock_transcript_list._manually_created_transcripts = {}
+        mock_transcript_list._generated_transcripts = {}
+
+        # Mock API to raise NoTranscriptFound
+        original_exception = NoTranscriptFound(
+            "test_video_id", ("en",), mock_transcript_list
+        )
+        mock_api.fetch.side_effect = original_exception
+        mock_api.list.return_value = mock_transcript_list
+
+        # Execute and assert exception is raised
+        with pytest.raises(NoTranscriptFound):
+            TranscribeYoutubeVideoBlock.get_transcript("test_video_id")
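The first three tests exercise `extract_video_id`, whose body is not part of this diff. As a rough sketch of what those tests expect, the function below handles the same three URL shapes (watch, `youtu.be`, and embed) using `urlparse` and `parse_qs`. The implementation details, including the accepted host names and the `None` return for unrecognized URLs, are assumptions made for illustration and may differ from the block's actual method.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumed set of hosts; the real block may accept more or fewer.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if none is found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        return parsed.path.lstrip("/").split("/")[0] or None

    if host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            # Standard links carry the ID in the "v" query parameter.
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        if parsed.path.startswith("/embed/"):
            # Embed links carry the ID right after the /embed/ prefix.
            return parsed.path[len("/embed/"):].split("/")[0] or None

    return None
```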
@@ -7,7 +7,7 @@ A block that transcribes the audio content of a YouTube video into text.
 This block takes a YouTube video URL as input and produces a text transcript of the video's audio content. It also extracts and provides the unique video ID associated with the YouTube video.
 
 ### How it works
-The block first extracts the video ID from the provided YouTube URL. It then uses this ID to fetch the video's transcript. The transcript is processed and formatted into a readable text format. If any errors occur during this process, the block will capture and report them.
+The block first extracts the video ID from the provided YouTube URL. It then uses this ID to fetch the video's transcript, preferring English when available. If an English transcript is not available, the block will automatically use the first available transcript in any other language (prioritizing manually created transcripts over auto-generated ones). The transcript is processed and formatted into a readable text format. If any errors occur during this process, the block will capture and report them.
 
 ### Inputs
 | Input | Description |
@@ -22,5 +22,5 @@ The block first extracts the video ID from the provided YouTube URL. It then use
 | Error | Any error message that occurs if the transcription process fails. |
 
 ### Possible use case
-A content creator could use this block to automatically generate subtitles for their YouTube videos. They could also use it to create text-based summaries of video content for SEO purposes or to make their content more accessible to hearing-impaired viewers.
+A content creator could use this block to automatically generate subtitles for their YouTube videos. They could also use it to create text-based summaries of video content for SEO purposes or to make their content more accessible to hearing-impaired viewers. The automatic language fallback feature ensures that transcripts can be obtained even from videos that only have subtitles in non-English languages.
 
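The documentation says the transcript is "processed and formatted into a readable text format". One plausible reading of that step is sketched below, under the assumption that formatting amounts to joining the caption lines with newlines and discarding the timing data. `Snippet` and `to_plain_text` are hypothetical names for illustration; the block itself delegates to the library's `TextFormatter`, whose exact output may differ.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Snippet:
    """Hypothetical stand-in for one timed caption line."""

    text: str
    start: float
    duration: float


def to_plain_text(snippets: Iterable[Snippet]) -> str:
    """Join caption lines into newline-separated text, dropping timing data."""
    # Whitespace-only captions are skipped so the output has no empty lines.
    return "\n".join(s.text.strip() for s in snippets if s.text.strip())
```

Because the fallback may return a transcript in any language, the formatting step should not assume English text; the sketch treats caption text as opaque strings for that reason.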