From 90af8f8e1aa4bc80c304db0f0ae7631adcb90fbe Mon Sep 17 00:00:00 2001
From: Copilot <198982749+Copilot@users.noreply.github.com>
Date: Mon, 20 Oct 2025 21:31:33 -0500
Subject: [PATCH] feat(backend): Add language fallback for YouTube
transcription block (#11057)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
## Problem
The YouTube transcription block would fail when attempting to transcribe
videos that only had transcripts available in non-English languages.
Even when usable transcripts existed in other languages, the block would
raise a `NoTranscriptFound` error because it only requested English
transcripts.
**Example video that would fail:**
https://www.youtube.com/watch?v=3AMl5d2NKpQ (only has Hungarian
transcripts)
**Error message:**
```
Could not retrieve a transcript for the video https://www.youtube.com/watch?v=3AMl5d2NKpQ!
No transcripts were found for any of the requested language codes: ('en',)
For this video (3AMl5d2NKpQ) transcripts are available in the following languages:
(GENERATED) - hu ("Hungarian (auto-generated)")
```
## Solution
Implemented language fallback in the
`TranscribeYoutubeVideoBlock.get_transcript()` method:
1. **First**, tries to fetch the English transcript (maintains backward
compatibility)
2. **If English is unavailable**, lists all available transcripts and
selects the first one using this priority:
   - Manually created transcripts (any language)
   - Auto-generated transcripts (any language)
3. **Only fails** if no transcripts exist at all
**Example behavior:**
```python
# Before: Video with only Hungarian transcript
get_transcript("3AMl5d2NKpQ") # ❌ Raises NoTranscriptFound
# After: Video with only Hungarian transcript
get_transcript("3AMl5d2NKpQ") # ✅ Returns Hungarian transcript
```
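
The priority order in step 2 can be sketched without network access. This is an illustration only: `FakeTranscript`, `pick_fallback`, and the sample data are hypothetical stand-ins, not part of `youtube_transcript_api` or of this patch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class FakeTranscript:
    """Hypothetical stand-in for a transcript entry (not the library's class)."""

    language_code: str
    is_generated: bool


def pick_fallback(transcripts: Sequence[FakeTranscript]) -> Optional[FakeTranscript]:
    """Return the first manually created transcript, else the first generated one."""
    manual = [t for t in transcripts if not t.is_generated]
    generated = [t for t in transcripts if t.is_generated]
    candidates = manual + generated
    return candidates[0] if candidates else None


# Sample inputs (assumed data, mirroring the cases described above)
only_hungarian = [FakeTranscript("hu", is_generated=True)]
mixed = [
    FakeTranscript("hu", is_generated=True),
    FakeTranscript("es", is_generated=False),
]

print(pick_fallback(only_hungarian))  # the auto-generated Hungarian entry
print(pick_fallback(mixed))           # the manually created Spanish entry
print(pick_fallback([]))              # no candidate at all
```

Within each group the sketch keeps input order, so which language wins among several manually created transcripts depends on the order in which they are listed.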
## Changes
- **Modified** `backend/blocks/youtube.py`: Added try/except logic to
fall back to any available language when English is not found
- **Added** `test/blocks/test_youtube.py`: Comprehensive test suite
covering URL extraction, language fallback, transcript preferences, and
error handling (7 tests)
- **Updated** `docs/content/platform/blocks/youtube.md`: Documented the
language fallback behavior and transcript priority order
## Testing
- ✅ All 7 new unit tests pass
- ✅ Block integration test passes
- ✅ Full test suite: 621 passed, 0 failed (no regressions)
- ✅ Code formatting and linting pass
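
The mocking pattern behind the fallback tests can be sketched in isolation. This is a simplified illustration under stated assumptions: `NoEnglish` and `get_with_fallback` are hypothetical stand-ins for the library's `NoTranscriptFound` and the block's `get_transcript()`, and the mocked `api` object returns a plain list rather than a real transcript list.

```python
from unittest.mock import Mock


class NoEnglish(Exception):
    """Hypothetical stand-in for the library's NoTranscriptFound."""


def get_with_fallback(api, video_id):
    """Try the default fetch; on NoEnglish, fetch the first listed transcript."""
    try:
        return api.fetch(video_id)
    except NoEnglish:
        available = api.list(video_id)
        if available:
            return available[0].fetch()
        # Nothing to fall back to: re-raise the original error
        raise


# Mock an API whose default fetch fails but which lists one transcript
api = Mock()
api.fetch.side_effect = NoEnglish()
hungarian = Mock()
hungarian.fetch.return_value = "hu-transcript"
api.list.return_value = [hungarian]

result = get_with_fallback(api, "3AMl5d2NKpQ")

# Mock an API with no transcripts at all: the original error should surface
empty_api = Mock()
empty_api.fetch.side_effect = NoEnglish()
empty_api.list.return_value = []
try:
    get_with_fallback(empty_api, "3AMl5d2NKpQ")
    reraised = False
except NoEnglish:
    reraised = True
```

Because the API object is mocked, neither path touches `www.youtube.com`, so checks of this kind can run in a sandbox without network access.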
## Impact
This fix enables the YouTube transcription block to work with
international content while maintaining full backward compatibility:
- ✅ Videos with a transcript in any language can now be transcribed
- ✅ English is still preferred when available
- ✅ No breaking changes to existing functionality
- ✅ Graceful degradation to available languages
Fixes #10637
Fixes https://linear.app/autogpt/issue/OPEN-2626
Original prompt
> Issue Title: if theres only one lanague available for transcribe
youtube return that langage not an error
> Issue Description: `Could not retrieve a transcript for the video
https://www.youtube.com/watch?v=3AMl5d2NKpQ! This is most likely caused
by: No transcripts were found for any of the requested language codes:
('en',) For this video (3AMl5d2NKpQ) transcripts are available in the
following languages: (MANUALLY CREATED) None (GENERATED) - hu
("Hungarian (auto-generated)") (TRANSLATION LANGUAGES) None If you are
sure that the described cause is not responsible for this error and that
a transcript should be retrievable, please create an issue at
https://github.com/jdepoix/youtube-transcript-api/issues. Please add
which version of youtube_transcript_api you are using and provide the
information needed to replicate the error. Also make sure that there are
no open issues which already describe your problem!` you can use this
video to test:
https://www.youtube.com/watch?v=3AMl5d2NKpQ
> Fixes
https://linear.app/autogpt/issue/OPEN-2626/if-theres-only-one-lanague-available-for-transcribe-youtube-return
>
>
> Comment by User :
> This thread is for an agent session with githubcopilotcodingagent.
>
> Comment by User :
> This comment thread is synced to a corresponding [GitHub
issue](https://github.com/Significant-Gravitas/AutoGPT/issues/10637).
All replies are displayed in both locations.
>
>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ntindle <8845353+ntindle@users.noreply.github.com>
Co-authored-by: Nicholas Tindle
---
.../backend/backend/blocks/youtube.py | 25 +++-
.../backend/test/blocks/test_youtube.py | 140 ++++++++++++++++++
docs/content/platform/blocks/youtube.md | 4 +-
3 files changed, 166 insertions(+), 3 deletions(-)
create mode 100644 autogpt_platform/backend/test/blocks/test_youtube.py
diff --git a/autogpt_platform/backend/backend/blocks/youtube.py b/autogpt_platform/backend/backend/blocks/youtube.py
index bb8c61449e..15d5699fff 100644
--- a/autogpt_platform/backend/backend/blocks/youtube.py
+++ b/autogpt_platform/backend/backend/blocks/youtube.py
@@ -1,6 +1,7 @@
from urllib.parse import parse_qs, urlparse
from youtube_transcript_api._api import YouTubeTranscriptApi
+from youtube_transcript_api._errors import NoTranscriptFound
from youtube_transcript_api._transcripts import FetchedTranscript
from youtube_transcript_api.formatters import TextFormatter
@@ -64,7 +65,29 @@ class TranscribeYoutubeVideoBlock(Block):
     @staticmethod
     def get_transcript(video_id: str) -> FetchedTranscript:
-        return YouTubeTranscriptApi().fetch(video_id=video_id)
+        """
+        Get transcript for a video, preferring English but falling back to any available language.
+
+        :param video_id: The YouTube video ID
+        :return: The fetched transcript
+        :raises: Any exception except NoTranscriptFound for requested languages
+        """
+        api = YouTubeTranscriptApi()
+        try:
+            # Try to get English transcript first (default behavior)
+            return api.fetch(video_id=video_id)
+        except NoTranscriptFound:
+            # If English is not available, get the first available transcript
+            transcript_list = api.list(video_id)
+            # Try manually created transcripts first, then generated ones
+            available_transcripts = list(
+                transcript_list._manually_created_transcripts.values()
+            ) + list(transcript_list._generated_transcripts.values())
+            if available_transcripts:
+                # Fetch the first available transcript
+                return available_transcripts[0].fetch()
+            # If no transcripts at all, re-raise the original error
+            raise
     @staticmethod
     def format_transcript(transcript: FetchedTranscript) -> str:
diff --git a/autogpt_platform/backend/test/blocks/test_youtube.py b/autogpt_platform/backend/test/blocks/test_youtube.py
new file mode 100644
index 0000000000..82c9311ff3
--- /dev/null
+++ b/autogpt_platform/backend/test/blocks/test_youtube.py
@@ -0,0 +1,140 @@
+from unittest.mock import Mock, patch
+
+import pytest
+from youtube_transcript_api._errors import NoTranscriptFound
+from youtube_transcript_api._transcripts import FetchedTranscript, Transcript
+
+from backend.blocks.youtube import TranscribeYoutubeVideoBlock
+
+
+class TestTranscribeYoutubeVideoBlock:
+    """Test cases for TranscribeYoutubeVideoBlock language fallback functionality."""
+
+    def setup_method(self):
+        """Set up test fixtures."""
+        self.youtube_block = TranscribeYoutubeVideoBlock()
+
+    def test_extract_video_id_standard_url(self):
+        """Test extracting video ID from standard YouTube URL."""
+        url = "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
+        video_id = self.youtube_block.extract_video_id(url)
+        assert video_id == "dQw4w9WgXcQ"
+
+    def test_extract_video_id_short_url(self):
+        """Test extracting video ID from shortened youtu.be URL."""
+        url = "https://youtu.be/dQw4w9WgXcQ"
+        video_id = self.youtube_block.extract_video_id(url)
+        assert video_id == "dQw4w9WgXcQ"
+
+    def test_extract_video_id_embed_url(self):
+        """Test extracting video ID from embed URL."""
+        url = "https://www.youtube.com/embed/dQw4w9WgXcQ"
+        video_id = self.youtube_block.extract_video_id(url)
+        assert video_id == "dQw4w9WgXcQ"
+
+    @patch("backend.blocks.youtube.YouTubeTranscriptApi")
+    def test_get_transcript_english_available(self, mock_api_class):
+        """Test getting transcript when English is available."""
+        # Setup mock
+        mock_api = Mock()
+        mock_api_class.return_value = mock_api
+        mock_transcript = Mock(spec=FetchedTranscript)
+        mock_api.fetch.return_value = mock_transcript
+
+        # Execute
+        result = TranscribeYoutubeVideoBlock.get_transcript("test_video_id")
+
+        # Assert
+        assert result == mock_transcript
+        mock_api.fetch.assert_called_once_with(video_id="test_video_id")
+        mock_api.list.assert_not_called()
+
+    @patch("backend.blocks.youtube.YouTubeTranscriptApi")
+    def test_get_transcript_fallback_to_first_available(self, mock_api_class):
+        """Test fallback to first available language when English is not available."""
+        # Setup mock
+        mock_api = Mock()
+        mock_api_class.return_value = mock_api
+
+        # Create mock transcript list with Hungarian transcript
+        mock_transcript_list = Mock()
+        mock_transcript_hu = Mock(spec=Transcript)
+        mock_fetched_transcript = Mock(spec=FetchedTranscript)
+        mock_transcript_hu.fetch.return_value = mock_fetched_transcript
+
+        # Set up the transcript list to have manually created transcripts empty
+        # and generated transcripts with Hungarian
+        mock_transcript_list._manually_created_transcripts = {}
+        mock_transcript_list._generated_transcripts = {"hu": mock_transcript_hu}
+
+        # Mock API to raise NoTranscriptFound for English, then return list
+        mock_api.fetch.side_effect = NoTranscriptFound(
+            "test_video_id", ("en",), mock_transcript_list
+        )
+        mock_api.list.return_value = mock_transcript_list
+
+        # Execute
+        result = TranscribeYoutubeVideoBlock.get_transcript("test_video_id")
+
+        # Assert
+        assert result == mock_fetched_transcript
+        mock_api.fetch.assert_called_once_with(video_id="test_video_id")
+        mock_api.list.assert_called_once_with("test_video_id")
+        mock_transcript_hu.fetch.assert_called_once()
+
+    @patch("backend.blocks.youtube.YouTubeTranscriptApi")
+    def test_get_transcript_prefers_manually_created(self, mock_api_class):
+        """Test that manually created transcripts are preferred over generated ones."""
+        # Setup mock
+        mock_api = Mock()
+        mock_api_class.return_value = mock_api
+
+        # Create mock transcript list with both manual and generated transcripts
+        mock_transcript_list = Mock()
+        mock_transcript_manual = Mock(spec=Transcript)
+        mock_transcript_generated = Mock(spec=Transcript)
+        mock_fetched_manual = Mock(spec=FetchedTranscript)
+        mock_transcript_manual.fetch.return_value = mock_fetched_manual
+
+        # Set up the transcript list
+        mock_transcript_list._manually_created_transcripts = {
+            "es": mock_transcript_manual
+        }
+        mock_transcript_list._generated_transcripts = {"hu": mock_transcript_generated}
+
+        # Mock API to raise NoTranscriptFound for English
+        mock_api.fetch.side_effect = NoTranscriptFound(
+            "test_video_id", ("en",), mock_transcript_list
+        )
+        mock_api.list.return_value = mock_transcript_list
+
+        # Execute
+        result = TranscribeYoutubeVideoBlock.get_transcript("test_video_id")
+
+        # Assert - should use manually created transcript first
+        assert result == mock_fetched_manual
+        mock_transcript_manual.fetch.assert_called_once()
+        mock_transcript_generated.fetch.assert_not_called()
+
+    @patch("backend.blocks.youtube.YouTubeTranscriptApi")
+    def test_get_transcript_no_transcripts_available(self, mock_api_class):
+        """Test that exception is re-raised when no transcripts are available at all."""
+        # Setup mock
+        mock_api = Mock()
+        mock_api_class.return_value = mock_api
+
+        # Create mock transcript list with no transcripts
+        mock_transcript_list = Mock()
+        mock_transcript_list._manually_created_transcripts = {}
+        mock_transcript_list._generated_transcripts = {}
+
+        # Mock API to raise NoTranscriptFound
+        original_exception = NoTranscriptFound(
+            "test_video_id", ("en",), mock_transcript_list
+        )
+        mock_api.fetch.side_effect = original_exception
+        mock_api.list.return_value = mock_transcript_list
+
+        # Execute and assert exception is raised
+        with pytest.raises(NoTranscriptFound):
+            TranscribeYoutubeVideoBlock.get_transcript("test_video_id")
diff --git a/docs/content/platform/blocks/youtube.md b/docs/content/platform/blocks/youtube.md
index 691ec2f7bc..2ff32a8b0d 100644
--- a/docs/content/platform/blocks/youtube.md
+++ b/docs/content/platform/blocks/youtube.md
@@ -7,7 +7,7 @@ A block that transcribes the audio content of a YouTube video into text.
This block takes a YouTube video URL as input and produces a text transcript of the video's audio content. It also extracts and provides the unique video ID associated with the YouTube video.
### How it works
-The block first extracts the video ID from the provided YouTube URL. It then uses this ID to fetch the video's transcript. The transcript is processed and formatted into a readable text format. If any errors occur during this process, the block will capture and report them.
+The block first extracts the video ID from the provided YouTube URL. It then uses this ID to fetch the video's transcript, preferring English when available. If an English transcript is not available, the block will automatically use the first available transcript in any other language (prioritizing manually created transcripts over auto-generated ones). The transcript is processed and formatted into a readable text format. If any errors occur during this process, the block will capture and report them.
### Inputs
| Input | Description |
@@ -22,5 +22,5 @@ The block first extracts the video ID from the provided YouTube URL. It then use
| Error | Any error message that occurs if the transcription process fails. |
### Possible use case
-A content creator could use this block to automatically generate subtitles for their YouTube videos. They could also use it to create text-based summaries of video content for SEO purposes or to make their content more accessible to hearing-impaired viewers.
+A content creator could use this block to automatically generate subtitles for their YouTube videos. They could also use it to create text-based summaries of video content for SEO purposes or to make their content more accessible to hearing-impaired viewers. The automatic language fallback feature ensures that transcripts can be obtained even from videos that only have subtitles in non-English languages.