feat(backend): Add language fallback for YouTube transcription block (#11057)

## Problem

The YouTube transcription block would fail when attempting to transcribe
videos that only had transcripts available in non-English languages.
Even when usable transcripts existed in other languages, the block would
raise a `NoTranscriptFound` error because it only requested English
transcripts.

**Example video that would fail:**
https://www.youtube.com/watch?v=3AMl5d2NKpQ (only has Hungarian
transcripts)

**Error message:**
```
Could not retrieve a transcript for the video https://www.youtube.com/watch?v=3AMl5d2NKpQ! 
No transcripts were found for any of the requested language codes: ('en',)

For this video (3AMl5d2NKpQ) transcripts are available in the following languages:
(GENERATED) - hu ("Hungarian (auto-generated)")
```

## Solution

Implemented language fallback in the
`TranscribeYoutubeVideoBlock.get_transcript()` method:

1. **First**, tries to fetch the English transcript (maintains backward
compatibility)
2. **If English is unavailable**, lists all available transcripts and
selects the first one using this priority:
   - Manually created transcripts (any language)
   - Auto-generated transcripts (any language)
3. **Only fails** if no transcripts exist at all

**Example behavior:**
```python
# Before: Video with only Hungarian transcript
get_transcript("3AMl5d2NKpQ")  # Raises NoTranscriptFound

# After: Video with only Hungarian transcript
get_transcript("3AMl5d2NKpQ")  # Returns Hungarian transcript
```
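
The fallback order described above can be sketched in a self-contained way. This is not the block's actual code: `FakeApi`, `FakeTranscript`, `list_transcripts`, and the local `NoTranscriptFound` are hypothetical stand-ins so the snippet runs without `youtube_transcript_api` or network access.

```python
from dataclasses import dataclass, field


class NoTranscriptFound(Exception):
    """Hypothetical, simplified stand-in for the library's exception."""


@dataclass
class FakeTranscript:
    """Hypothetical stand-in for a listed transcript that can be fetched."""

    language_code: str
    is_generated: bool

    def fetch(self) -> str:
        kind = "generated" if self.is_generated else "manual"
        return f"{self.language_code} ({kind})"


@dataclass
class FakeApi:
    """Hypothetical stand-in holding the transcripts of a single video."""

    transcripts: list = field(default_factory=list)

    def fetch(self, video_id: str) -> str:
        # Assumption: the default request only asks for English.
        for transcript in self.transcripts:
            if transcript.language_code == "en":
                return transcript.fetch()
        raise NoTranscriptFound(video_id)

    def list_transcripts(self, video_id: str) -> list:
        return list(self.transcripts)


def get_transcript(api: FakeApi, video_id: str) -> str:
    try:
        # Step 1: prefer English.
        return api.fetch(video_id)
    except NoTranscriptFound:
        # Step 2: manual transcripts sort before generated ones
        # (False < True); sorted() is stable, so listing order is kept
        # within each group.
        ordered = sorted(
            api.list_transcripts(video_id), key=lambda t: t.is_generated
        )
        if ordered:
            return ordered[0].fetch()
        # Step 3: nothing available at all, so re-raise the original error.
        raise


api = FakeApi([FakeTranscript("hu", True), FakeTranscript("es", False)])
print(get_transcript(api, "3AMl5d2NKpQ"))  # es (manual)
```

The bare `raise` in the last step keeps the original error and its message about the requested language codes, which is what callers saw before this change.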

## Changes

- **Modified** `backend/blocks/youtube.py`: Added `try`/`except` logic to
fall back to any available language when English is not found
- **Added** `test/blocks/test_youtube.py`: Comprehensive test suite
covering URL extraction, language fallback, transcript preferences, and
error handling (7 tests)
- **Updated** `docs/content/platform/blocks/youtube.md`: Documented the
language fallback behavior and transcript priority order
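
The URL extraction tests cover three URL shapes: standard `watch?v=`, shortened `youtu.be`, and `/embed/`. The sketch below shows one way to handle them with `urllib.parse`. It is an assumption for illustration only: the block's real `extract_video_id` is not part of this diff and may differ.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> str:
    """Illustrative only; not the block's actual implementation."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Shortened links carry the ID as the first path segment.
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0]

    # Embed links look like /embed/<id>.
    if parsed.path.startswith("/embed/"):
        return parsed.path.split("/")[2]

    # Standard links carry the ID in the "v" query parameter.
    query = parse_qs(parsed.query)
    if "v" in query:
        return query["v"][0]

    raise ValueError(f"Could not find a video ID in {url!r}")


for example in (
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://youtu.be/dQw4w9WgXcQ",
    "https://www.youtube.com/embed/dQw4w9WgXcQ",
):
    print(extract_video_id(example))
```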

## Testing

- All 7 new unit tests pass
- Block integration test passes
- Full test suite: 621 passed, 0 failed (no regressions)
- Code formatting and linting pass

## Impact

This fix enables the YouTube transcription block to work with
international content while maintaining full backward compatibility:

- Videos with a transcript in any language can now be transcribed
- English is still preferred when available
- No breaking changes to existing functionality
- Graceful degradation to available languages

Fixes #10637
Fixes https://linear.app/autogpt/issue/OPEN-2626

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ntindle <8845353+ntindle@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Copilot
2025-10-20 21:31:33 -05:00
committed by GitHub
parent eba67e0a4b
commit 90af8f8e1a
3 changed files with 166 additions and 3 deletions

backend/blocks/youtube.py

@@ -1,6 +1,7 @@
 from urllib.parse import parse_qs, urlparse

 from youtube_transcript_api._api import YouTubeTranscriptApi
+from youtube_transcript_api._errors import NoTranscriptFound
 from youtube_transcript_api._transcripts import FetchedTranscript
 from youtube_transcript_api.formatters import TextFormatter

@@ -64,7 +65,29 @@ class TranscribeYoutubeVideoBlock(Block):

     @staticmethod
     def get_transcript(video_id: str) -> FetchedTranscript:
-        return YouTubeTranscriptApi().fetch(video_id=video_id)
+        """
+        Get transcript for a video, preferring English but falling back to any available language.
+
+        :param video_id: The YouTube video ID
+        :return: The fetched transcript
+        :raises: Any exception except NoTranscriptFound for requested languages
+        """
+        api = YouTubeTranscriptApi()
+        try:
+            # Try to get English transcript first (default behavior)
+            return api.fetch(video_id=video_id)
+        except NoTranscriptFound:
+            # If English is not available, get the first available transcript
+            transcript_list = api.list(video_id)
+            # Try manually created transcripts first, then generated ones
+            available_transcripts = list(
+                transcript_list._manually_created_transcripts.values()
+            ) + list(transcript_list._generated_transcripts.values())
+            if available_transcripts:
+                # Fetch the first available transcript
+                return available_transcripts[0].fetch()
+            # If no transcripts at all, re-raise the original error
+            raise

     @staticmethod
     def format_transcript(transcript: FetchedTranscript) -> str:

test/blocks/test_youtube.py

@@ -0,0 +1,140 @@
+from unittest.mock import Mock, patch
+
+import pytest
+from youtube_transcript_api._errors import NoTranscriptFound
+from youtube_transcript_api._transcripts import FetchedTranscript, Transcript
+
+from backend.blocks.youtube import TranscribeYoutubeVideoBlock
+
+
+class TestTranscribeYoutubeVideoBlock:
+    """Test cases for TranscribeYoutubeVideoBlock language fallback functionality."""
+
+    def setup_method(self):
+        """Set up test fixtures."""
+        self.youtube_block = TranscribeYoutubeVideoBlock()
+
+    def test_extract_video_id_standard_url(self):
+        """Test extracting video ID from standard YouTube URL."""
+        url = "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
+        video_id = self.youtube_block.extract_video_id(url)
+        assert video_id == "dQw4w9WgXcQ"
+
+    def test_extract_video_id_short_url(self):
+        """Test extracting video ID from shortened youtu.be URL."""
+        url = "https://youtu.be/dQw4w9WgXcQ"
+        video_id = self.youtube_block.extract_video_id(url)
+        assert video_id == "dQw4w9WgXcQ"
+
+    def test_extract_video_id_embed_url(self):
+        """Test extracting video ID from embed URL."""
+        url = "https://www.youtube.com/embed/dQw4w9WgXcQ"
+        video_id = self.youtube_block.extract_video_id(url)
+        assert video_id == "dQw4w9WgXcQ"
+
+    @patch("backend.blocks.youtube.YouTubeTranscriptApi")
+    def test_get_transcript_english_available(self, mock_api_class):
+        """Test getting transcript when English is available."""
+        # Setup mock
+        mock_api = Mock()
+        mock_api_class.return_value = mock_api
+        mock_transcript = Mock(spec=FetchedTranscript)
+        mock_api.fetch.return_value = mock_transcript
+
+        # Execute
+        result = TranscribeYoutubeVideoBlock.get_transcript("test_video_id")
+
+        # Assert
+        assert result == mock_transcript
+        mock_api.fetch.assert_called_once_with(video_id="test_video_id")
+        mock_api.list.assert_not_called()
+
+    @patch("backend.blocks.youtube.YouTubeTranscriptApi")
+    def test_get_transcript_fallback_to_first_available(self, mock_api_class):
+        """Test fallback to first available language when English is not available."""
+        # Setup mock
+        mock_api = Mock()
+        mock_api_class.return_value = mock_api
+
+        # Create mock transcript list with Hungarian transcript
+        mock_transcript_list = Mock()
+        mock_transcript_hu = Mock(spec=Transcript)
+        mock_fetched_transcript = Mock(spec=FetchedTranscript)
+        mock_transcript_hu.fetch.return_value = mock_fetched_transcript
+
+        # Set up the transcript list to have manually created transcripts empty
+        # and generated transcripts with Hungarian
+        mock_transcript_list._manually_created_transcripts = {}
+        mock_transcript_list._generated_transcripts = {"hu": mock_transcript_hu}
+
+        # Mock API to raise NoTranscriptFound for English, then return list
+        mock_api.fetch.side_effect = NoTranscriptFound(
+            "test_video_id", ("en",), mock_transcript_list
+        )
+        mock_api.list.return_value = mock_transcript_list
+
+        # Execute
+        result = TranscribeYoutubeVideoBlock.get_transcript("test_video_id")
+
+        # Assert
+        assert result == mock_fetched_transcript
+        mock_api.fetch.assert_called_once_with(video_id="test_video_id")
+        mock_api.list.assert_called_once_with("test_video_id")
+        mock_transcript_hu.fetch.assert_called_once()
+
+    @patch("backend.blocks.youtube.YouTubeTranscriptApi")
+    def test_get_transcript_prefers_manually_created(self, mock_api_class):
+        """Test that manually created transcripts are preferred over generated ones."""
+        # Setup mock
+        mock_api = Mock()
+        mock_api_class.return_value = mock_api
+
+        # Create mock transcript list with both manual and generated transcripts
+        mock_transcript_list = Mock()
+        mock_transcript_manual = Mock(spec=Transcript)
+        mock_transcript_generated = Mock(spec=Transcript)
+        mock_fetched_manual = Mock(spec=FetchedTranscript)
+        mock_transcript_manual.fetch.return_value = mock_fetched_manual
+
+        # Set up the transcript list
+        mock_transcript_list._manually_created_transcripts = {
+            "es": mock_transcript_manual
+        }
+        mock_transcript_list._generated_transcripts = {"hu": mock_transcript_generated}
+
+        # Mock API to raise NoTranscriptFound for English
+        mock_api.fetch.side_effect = NoTranscriptFound(
+            "test_video_id", ("en",), mock_transcript_list
+        )
+        mock_api.list.return_value = mock_transcript_list
+
+        # Execute
+        result = TranscribeYoutubeVideoBlock.get_transcript("test_video_id")
+
+        # Assert - should use manually created transcript first
+        assert result == mock_fetched_manual
+        mock_transcript_manual.fetch.assert_called_once()
+        mock_transcript_generated.fetch.assert_not_called()
+
+    @patch("backend.blocks.youtube.YouTubeTranscriptApi")
+    def test_get_transcript_no_transcripts_available(self, mock_api_class):
+        """Test that exception is re-raised when no transcripts are available at all."""
+        # Setup mock
+        mock_api = Mock()
+        mock_api_class.return_value = mock_api
+
+        # Create mock transcript list with no transcripts
+        mock_transcript_list = Mock()
+        mock_transcript_list._manually_created_transcripts = {}
+        mock_transcript_list._generated_transcripts = {}
+
+        # Mock API to raise NoTranscriptFound
+        original_exception = NoTranscriptFound(
+            "test_video_id", ("en",), mock_transcript_list
+        )
+        mock_api.fetch.side_effect = original_exception
+        mock_api.list.return_value = mock_transcript_list
+
+        # Execute and assert exception is raised
+        with pytest.raises(NoTranscriptFound):
+            TranscribeYoutubeVideoBlock.get_transcript("test_video_id")

docs/content/platform/blocks/youtube.md

@@ -7,7 +7,7 @@ A block that transcribes the audio content of a YouTube video into text.
 This block takes a YouTube video URL as input and produces a text transcript of the video's audio content. It also extracts and provides the unique video ID associated with the YouTube video.

 ### How it works
-The block first extracts the video ID from the provided YouTube URL. It then uses this ID to fetch the video's transcript. The transcript is processed and formatted into a readable text format. If any errors occur during this process, the block will capture and report them.
+The block first extracts the video ID from the provided YouTube URL. It then uses this ID to fetch the video's transcript, preferring English when available. If an English transcript is not available, the block will automatically use the first available transcript in any other language (prioritizing manually created transcripts over auto-generated ones). The transcript is processed and formatted into a readable text format. If any errors occur during this process, the block will capture and report them.

 ### Inputs
 | Input | Description |
@@ -22,5 +22,5 @@ The block first extracts the video ID from the provided YouTube URL. It then use
 | Error | Any error message that occurs if the transcription process fails. |

 ### Possible use case
-A content creator could use this block to automatically generate subtitles for their YouTube videos. They could also use it to create text-based summaries of video content for SEO purposes or to make their content more accessible to hearing-impaired viewers.
+A content creator could use this block to automatically generate subtitles for their YouTube videos. They could also use it to create text-based summaries of video content for SEO purposes or to make their content more accessible to hearing-impaired viewers. The automatic language fallback feature ensures that transcripts can be obtained even from videos that only have subtitles in non-English languages.