+ Add support for more domains
* [svt] Fix series extraction (#22297)
* [svt] Fix article extraction (#22897, #22919)
-* [soundcloud] Imporve private playlist/set tracks extraction (#3707)
+* [soundcloud] Improve private playlist/set tracks extraction (#3707)
version 2020.01.24
* [abcotvs] Relax URL regular expression and improve metadata extraction
(#18014)
* [channel9] Reduce response size
-* [adobetv] Improve extaction
+* [adobetv] Improve extraction
* Use OnDemandPagedList for list extractors
* Reduce show extraction requests
* Extract original video format and subtitles
* [dailymotion] Improve extraction
* Extract http formats included in m3u8 manifest
* Fix user extraction (#3553, #21415)
- + Add suport for User Authentication (#11491)
+ + Add support for User Authentication (#11491)
* Fix password protected videos extraction (#23176)
* Respect age limit option and family filter cookie value (#18437)
* Handle video url playlist query param
- [go90] Remove extractor
* [kakao] Remove raw request
+ [kakao] Extract format total bitrate
-* [daum] Fix VOD and Clip extracton (#15015)
+* [daum] Fix VOD and Clip extraction (#15015)
* [kakao] Improve extraction
+ Add support for embed URLs
+ Add support for Kakao Legacy vid based embed URLs
* Improve format extraction (#22123)
+ Extract uploader_id and uploader_url (#21916)
+ Extract all known thumbnails (#19071, #20659)
- * Fix extration for private playlists (#20976)
+ * Fix extraction for private playlists (#20976)
+ Add support for playlist embeds (#20976)
* Skip preview formats (#22806)
* [dplay] Improve extraction
* [hbo] Fix extraction and extract subtitles (#14629, #13709)
* [youtube] Extract srv[1-3] subtitle formats (#20566)
* [adultswim] Fix extraction (#18025)
-* [teamcoco] Fix extraction and add suport for subdomains (#17099, #20339)
+* [teamcoco] Fix extraction and add support for subdomains (#17099, #20339)
* [adn] Fix subtitle compatibility with ffmpeg
* [adn] Fix extraction and add support for positioning styles (#20549)
* [vk] Use unique video id (#17848)
Extractors
+ [wwe] Extract subtitles
-+ [wwe] Add support for playlistst (#14781)
++ [wwe] Add support for playlists (#14781)
+ [wwe] Add support for wwe.com (#14781, #17450)
* [vk] Detect geo restriction (#17767)
* [openload] Use original host during extraction (#18211)
* [youku] Update ccode (#14872)
* [mnet] Fix format extraction (#14883)
+ [xiami] Add Referer header to API request
-* [mtv] Correct scc extention in extracted subtitles (#13730)
+* [mtv] Correct scc extension in extracted subtitles (#13730)
* [vvvvid] Fix extraction for kenc videos (#13406)
+ [br] Add support for BR Mediathek videos (#14560, #14788)
+ [daisuki] Add support for motto.daisuki.com (#14681)
* [nexx] Extract more formats
+ [openload] Add support for openload.link (#14763)
* [empflix] Relax URL regular expression
-* [empflix] Fix extractrion
+* [empflix] Fix extraction
* [tnaflix] Don't modify download URLs (#14811)
- [gamersyde] Remove extractor
* [francetv:generationwhat] Fix extraction
* [yahoo] Bypass geo restriction for brightcove (#14210)
* [yahoo] Use extracted brightcove account id (#14210)
* [rtve:alacarta] Fix extraction (#14290)
-+ [yahoo] Add support for custom brigthcove embeds (#14210)
++ [yahoo] Add support for custom brightcove embeds (#14210)
+ [generic] Add support for Video.js embeds
+ [gfycat] Add support for /gifs/detail URLs (#14322)
* [generic] Fix infinite recursion for twitter:player URLs (#14339)
* [amcnetworks] Make rating optional (#12453)
* [cloudy] Fix extraction (#13737)
+ [nickru] Add support for nickelodeon.ru
-* [mtv] Improve thumbnal extraction
+* [mtv] Improve thumbnail extraction
* [nick] Automate geo-restriction bypass (#13711)
* [niconico] Improve error reporting (#13696)
+ [cda] Support birthday verification (#12789)
* [leeco] Fix extraction (#12974)
+ [pbs] Extract chapters
-* [amp] Imporove thumbnail and subtitles extraction
+* [amp] Improve thumbnail and subtitles extraction
* [foxsports] Fix extraction (#12945)
- [coub] Remove comment count extraction (#12941)
+ [rbmaradio] Add support for redbullradio.com URLs (#12687)
+ [npo:live] Add support for default URL (#12555)
* [mixcloud:playlist] Fix title, description and view count extraction (#12582)
-+ [thesun] Add suport for thesun.co.uk (#11298, #12674)
++ [thesun] Add support for thesun.co.uk (#11298, #12674)
+ [ceskateleveize:porady] Add support for porady (#7411, #12645)
* [ceskateleveize] Improve extraction and remove URL replacement hacks
+ [kaltura] Add support for iframe embeds (#12679)
* [funimation] Fix extraction (#10696, #11773)
+ [xfileshare] Add support for vidabc.com (#12589)
+ [xfileshare] Improve extraction and extract hls formats
-+ [crunchyroll] Pass geo verifcation proxy
++ [crunchyroll] Pass geo verification proxy
+ [cwtv] Extract ISM formats
+ [tvplay] Bypass geo restriction
+ [vrv] Add support for vrv.co
+ [bostonglobe] Add extractor for bostonglobe.com (#12099)
+ [toongoggles] Add support for toongoggles.com (#12171)
+ [medialaan] Add support for Medialaan sites (#9974, #11912)
-+ [discoverynetworks] Add support for more domains and bypass geo restiction
++ [discoverynetworks] Add support for more domains and bypass geo restriction
* [openload] Fix extraction (#10408)
Fixed/improved extractors
- youtube
- ard
-- srmediatek (#9373)
+- srmediathek (#9373)
version 2016.07.09
- kaltura (#5557)
- la7
- Changed features
-- Rename --cn-verfication-proxy to --geo-verification-proxy
+- Rename --cn-verification-proxy to --geo-verification-proxy
Miscellaneous
- Add script for displaying downloads statistics
"writeinfojson": true,
"writesubtitles": false,
"allsubtitles": false,
- "listssubtitles": false,
+ "listsubtitles": false,
"socket_timeout": 20,
"fixup": "never"
}
# HTMLParseError has been deprecated in Python 3.3 and removed in
# Python 3.5. Introducing dummy exception for Python >3.5 for compatible
- # and uniform cross-version exceptiong handling
+ # and uniform cross-version exception handling
class compat_HTMLParseError(Exception):
pass
]
@classmethod
- def _build_brighcove_url(cls, object_str):
+ def _build_brightcove_url(cls, object_str):
"""
Build a Brightcove url from a xml string containing
<object class="BrightcoveExperience">{params}</object>
return cls._make_brightcove_url(params)
@classmethod
- def _build_brighcove_url_from_js(cls, object_js):
+ def _build_brightcove_url_from_js(cls, object_js):
# The layout of JS is as follows:
# customBC.createVideo = function (width, height, playerID, playerKey, videoPlayer, VideoRandomID) {
# // build Brightcove <object /> XML
).+?>\s*</object>''',
webpage)
if matches:
- return list(filter(None, [cls._build_brighcove_url(m) for m in matches]))
+ return list(filter(None, [cls._build_brightcove_url(m) for m in matches]))
matches = re.findall(r'(customBC\.createVideo\(.+?\);)', webpage)
if matches:
return list(filter(None, [
- cls._build_brighcove_url_from_js(custom_bc)
+ cls._build_brightcove_url_from_js(custom_bc)
for custom_bc in matches]))
return [src for _, src in re.findall(
r'<iframe[^>]+src=([\'"])((?:https?:)?//link\.brightcove\.com/services/player/(?!\1).+)\1', webpage)]
# just the media without qualities renditions.
# Fortunately, master playlist can be easily distinguished from media
# playlist based on particular tags availability. As of [1, 4.3.3, 4.3.4]
- # master playlist tags MUST NOT appear in a media playist and vice versa.
+ # master playlist tags MUST NOT appear in a media playlist and vice versa.
# As of [1, 4.3.3.1] #EXT-X-TARGETDURATION tag is REQUIRED for every
# media playlist and MUST NOT appear in master playlist thus we can
# clearly detect media playlist with this criterion.
title = get_item('title', preferred_langs) or video_id
description = get_item('description', preferred_langs)
- thumbnmail = xpath_text(playlist, './info/thumburl', 'thumbnail')
+ thumbnail = xpath_text(playlist, './info/thumburl', 'thumbnail')
upload_date = unified_strdate(xpath_text(playlist, './info/date', 'upload date'))
duration = parse_duration(xpath_text(playlist, './info/duration', 'duration'))
view_count = int_or_none(xpath_text(playlist, './info/views', 'views'))
'id': video_id,
'title': title,
'description': description,
- 'thumbnail': thumbnmail,
+ 'thumbnail': thumbnail,
'upload_date': upload_date,
'duration': duration,
'view_count': view_count,
'skip_download': True,
}
},
- # MTVSercices embed
+ # MTVServices embed
{
'url': 'http://www.vulture.com/2016/06/new-key-peele-sketches-released.html',
'md5': 'ca1aef97695ef2c1d6973256a57e5252',
duration = float_or_none(xpath_text(doc, 'DURATION'), scale=1000)
description = xpath_text(doc, 'ABSTRACT')
thumbnail = xpath_text(doc, './THUMBNAILIMAGE/FILENAME')
- createtion_time = timeconvert(xpath_text(doc, 'rfc822creationdate'))
+ creation_time = timeconvert(xpath_text(doc, 'rfc822creationdate'))
quality_options = doc.find('{http://search.yahoo.com/mrss/}group').findall('{http://search.yahoo.com/mrss/}content')
formats = []
'duration': duration,
'formats': formats,
'thumbnail': thumbnail,
- 'timestamp': createtion_time,
+ 'timestamp': creation_time,
}
},
}],
}, {
- # mutlimedia, not media title
+ # multimedia, not media title
'url': 'https://www.npr.org/2017/06/19/533198237/tigers-jaw-tiny-desk-concert',
'info_dict': {
'id': '533198237',
if media_id:
return media_id, presumptive_id, upload_date, description
- # Fronline video embedded via flp
+ # Frontline video embedded via flp
video_id = self._search_regex(
r'videoid\s*:\s*"([\d+a-z]{7,})"', webpage, 'videoid', default=None)
if video_id:
class SoundcloudPagedPlaylistBaseIE(SoundcloudIE):
def _extract_playlist(self, base_url, playlist_id, playlist_title):
- # Per the SoundCloud documentation, the maximum limit for a linked partioning query is 200.
+ # Per the SoundCloud documentation, the maximum limit for a linked partitioning query is 200.
# https://developers.soundcloud.com/blog/offset-pagination-deprecated
COMMON_QUERY = {
'limit': 200,
# return self._extract_via_api(kind, video_id)
# JSON api does not provide some audio formats (e.g. ogg) thus
- # extractiong audio via webpage
+ # extracting audio via webpage
webpage = self._download_webpage(url, video_id)
if m:
return [m.group('url')]
- # Are whitesapces ignored in URLs?
+ # Are whitespaces ignored in URLs?
# https://github.com/ytdl-org/youtube-dl/issues/12044
matches = re.findall(
r'(?s)<(?:iframe|script)[^>]+src=(["\'])((?:https?:)?//player\.theplatform\.com/p/.+?)\1', webpage)
content_id = xpath_text(video_data, 'contentId') or video_id
# rtmp_src = xpath_text(video_data, 'akamai/src')
# if rtmp_src:
- # splited_rtmp_src = rtmp_src.split(',')
- # if len(splited_rtmp_src) == 2:
- # rtmp_src = splited_rtmp_src[1]
+ # split_rtmp_src = rtmp_src.split(',')
+ # if len(split_rtmp_src) == 2:
+ # rtmp_src = split_rtmp_src[1]
# aifp = xpath_text(video_data, 'akamai/aifp', default='')
urls = []
}]
_PAGE_SIZE = 100
- def _fetch_page(self, album_id, authorizaion, hashed_pass, page):
+ def _fetch_page(self, album_id, authorization, hashed_pass, page):
api_page = page + 1
query = {
'fields': 'link,uri',
videos = self._download_json(
'https://api.vimeo.com/albums/%s/videos' % album_id,
album_id, 'Downloading page %d' % api_page, query=query, headers={
- 'Authorization': 'jwt ' + authorizaion,
+ 'Authorization': 'jwt ' + authorization,
})['data']
for video in videos:
link = video.get('link')
def _decrypt(origin):
n = int(origin[0])
origin = origin[1:]
- short_lenth = len(origin) // n
- long_num = len(origin) - short_lenth * n
+ short_length = len(origin) // n
+ long_num = len(origin) - short_length * n
l = tuple()
for i in range(0, n):
- length = short_lenth
+ length = short_length
if i < long_num:
length += 1
l += (origin[0:length], )
origin = origin[length:]
ans = ''
- for i in range(0, short_lenth + 1):
+ for i in range(0, short_length + 1):
for j in range(0, n):
if len(l[j]) > i:
ans += l[j][i]
# Parsing code and msg
if (self.code in (errno.ENOSPC, errno.EDQUOT)
- or 'No space left' in self.msg or 'Disk quota excedded' in self.msg):
+ or 'No space left' in self.msg or 'Disk quota exceeded' in self.msg):
self.reason = 'NO_SPACE'
elif self.code == errno.E2BIG or 'Argument list too long' in self.msg:
self.reason = 'VALUE_TOO_LONG'
# http://tools.ietf.org/html/rfc6381
if not codecs_str:
return {}
- splited_codecs = list(filter(None, map(
+ split_codecs = list(filter(None, map(
lambda str: str.strip(), codecs_str.strip().strip(',').split(','))))
vcodec, acodec = None, None
- for full_codec in splited_codecs:
+ for full_codec in split_codecs:
codec = full_codec.split('.')[0]
if codec in ('avc1', 'avc2', 'avc3', 'avc4', 'vp9', 'vp8', 'hev1', 'hev2', 'h263', 'h264', 'mp4v', 'hvc1', 'av01', 'theora'):
if not vcodec:
else:
write_string('WARNING: Unknown codec %s\n' % full_codec, sys.stderr)
if not vcodec and not acodec:
- if len(splited_codecs) == 2:
+ if len(split_codecs) == 2:
return {
- 'vcodec': splited_codecs[0],
- 'acodec': splited_codecs[1],
+ 'vcodec': split_codecs[0],
+ 'acodec': split_codecs[1],
}
else:
return {
def decode_packed_codes(code):
mobj = re.search(PACKED_CODES_RE, code)
- obfucasted_code, base, count, symbols = mobj.groups()
+ obfuscated_code, base, count, symbols = mobj.groups()
base = int(base)
count = int(count)
symbols = symbols.split('|')
return re.sub(
r'\b(\w+)\b', lambda mobj: symbol_table[mobj.group(0)],
- obfucasted_code)
+ obfuscated_code)
def caesar(s, alphabet, shift):