-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathwatch.py
More file actions
58 lines (46 loc) · 2.43 KB
/
watch.py
File metadata and controls
58 lines (46 loc) · 2.43 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
"""
It turns out that (most) YouTube videos can be embedded in other websites, just
like the above. For instance, if you visit https://youtu.be/xvFZjo5PgG0 on a
laptop or desktop, click Share, and then click Embed, you’ll see HTML (the
language in which web pages are written) like the below, which you could then
copy into your own website’s source code, wherein iframe is an HTML “element,”
and src is one of several HTML “attributes” therein, the value of which, between
quotes, is https://www.youtube.com/embed/xvFZjo5PgG0.
<iframe width="560" height="315" src="https://www.youtube.com/embed/xvFZjo5PgG0"
title="YouTube video player" frameborder="0" allow="accelerometer; autoplay;
clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowfullscreen></iframe>
Because some HTML attributes are optional, you could instead minimally embed
just the below.
<iframe src="https://www.youtube.com/embed/xvFZjo5PgG0"></iframe>
Suppose that you’d like to extract the URLs of YouTube videos that are embedded
in pages (e.g., https://www.youtube.com/embed/xvFZjo5PgG0), converting them back
to shorter, shareable youtu.be URLs (e.g., https://youtu.be/xvFZjo5PgG0) where
they can be watched on YouTube itself.
In a file called watch.py, implement a function called parse that expects a str
of HTML as input, extracts any YouTube URL that’s the value of a src attribute
of an iframe element therein, and returns its shorter, shareable youtu.be
equivalent as a str. Expect that any such URL will be in one of the formats
below. Assume that the value of src will be surrounded by double quotes. And
assume that the input will contain no more than one such URL. If the input does
not contain any such URL at all, return None.
http://youtube.com/embed/xvFZjo5PgG0
https://youtube.com/embed/xvFZjo5PgG0
https://www.youtube.com/embed/xvFZjo5PgG0
Structure watch.py as follows, wherein you’re welcome to modify main and/or
implement other functions as you see fit, but you may not import any other
libraries. You’re welcome, but not required, to use re and/or sys.
"""
import re
def main():
print(parse(input("HTML: ")))
def parse(s: str) -> str:
# get youtube URL
pattern = r'(?:iframe.+?src=")(?:https?://)?(?:www\.)?youtube.com/embed/(\w+)'
clean_url = re.search(pattern, s)
if clean_url:
shorter_url = "https://youtu.be/" + clean_url.group(1)
return shorter_url
return None
if __name__ == "__main__":
main()