Differentiating Automatically-generated Caption #199

teron131 · 2024-08-26T02:59:05Z

How can I tell from a fetched caption's language code whether the caption is automatically generated?

Currently:

from pytubefix import YouTube

# sample_urls[0] is a video where only an auto-generated English caption is available
yt = YouTube(sample_urls[0])
yt.caption_tracks[0]
# <Caption lang="English" code="en">

yt.captions["en"]
# <Caption lang="English" code="en">

I see that the source code in captions.py includes:

class Caption:
    """Container for caption tracks."""

    def __init__(self, caption_track: Dict):
        """Construct a :class:`Caption <Caption>`.

        :param dict caption_track:
            Caption track data extracted from ``watch_html``.
        """
        self.url = caption_track.get("baseUrl")

        # Certain videos have runs instead of simpleText
        #  this handles that edge case
        name_dict = caption_track['name']
        if 'simpleText' in name_dict:
            self.name = name_dict['simpleText']
        else:
            for el in name_dict['runs']:
                if 'text' in el:
                    self.name = el['text']

        # Use "vssId" instead of "languageCode", fix issue #779
        self.code = caption_track["vssId"]
        # Remove preceding '.' for backwards compatibility, e.g.:
        # English -> vssId: .en, languageCode: en
        # English (auto-generated) -> vssId: a.en, languageCode: en
        self.code = self.code.strip('.')

Which calls should I make so that the returned code is "a.en" rather than "en"?
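To make the expected behavior concrete, here is a minimal sketch of the distinction I am after. It does not call pytubefix at all: it only mirrors the `vssId` handling quoted above and assumes the convention stated in those comments (`.en` for a regular track, `a.en` for an auto-generated one). The helper names `normalize_vss_id` and `split_caption_code` and the sample track dicts are hypothetical, not part of the library.

```python
from typing import Tuple


def normalize_vss_id(vss_id: str) -> str:
    """Mirror the quoted Caption logic: strip '.' from both ends of vssId."""
    return vss_id.strip(".")


def split_caption_code(code: str) -> Tuple[str, bool]:
    """Return (language_code, is_auto_generated) for a normalized code.

    Assumes auto-generated tracks carry an 'a.' prefix, as the comments
    in the quoted source suggest.
    """
    if code.startswith("a."):
        return code[2:], True
    return code, False


# Hypothetical caption-track data shaped like the dicts Caption receives.
tracks = [
    {"vssId": ".en", "languageCode": "en"},
    {"vssId": "a.en", "languageCode": "en"},
]

for track in tracks:
    code = normalize_vss_id(track["vssId"])
    language, auto_generated = split_caption_code(code)
    print(code, language, auto_generated)
```

If `Caption.code` exposed the normalized `vssId` like this, checking for the `a.` prefix would be enough to tell the two kinds of track apart.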


teron131 added the enhancement New feature or request label Aug 26, 2024
