Misty De Méo
ec268af922
worker: catch some missing statements
2025-02-21 09:17:05 -08:00
Misty De Méo
b33b2fed8c
robots: convert to structlog
2025-02-21 09:17:05 -08:00
Misty De Méo
ba0db01d32
pywb: convert to structlog
2025-02-21 09:17:05 -08:00
Misty De Méo
7634eb1b57
model: convert to structlog
2025-02-21 09:17:05 -08:00
Misty De Méo
d4b493f1ae
frontier: convert to structlog
2025-02-21 09:17:05 -08:00
Misty De Méo
cf6b423019
easy: convert to structlog
2025-02-21 09:17:05 -08:00
Misty De Méo
19dd7c97f0
chrome: convert to structlog
2025-02-21 09:17:05 -08:00
Misty De Méo
97f225d54c
browser: convert to structlog
2025-02-21 09:17:05 -08:00
Misty De Méo
a7e915b35f
__init__: port logging statements
2025-02-21 09:17:05 -08:00
Misty De Méo
712ecb9690
worker: port trace/notice
2025-02-21 09:17:05 -08:00
Misty De Méo
9bf9cf1382
convert ydl module logger
2025-02-21 09:17:05 -08:00
Misty De Méo
715a1471c0
initial stab at worker
2025-02-19 14:48:34 -08:00
Misty De Méo
69d682beb9
Merge pull request #324 from mistydemeo/mdfind
...
CLI: improve Chrome finding on Mac
2025-02-19 14:48:21 -08:00
Misty De Méo
53cac65540
CLI: improve Chrome finding on Mac
...
On macOS, we can find Chrome even if it's installed in a non-default
path by querying `mdfind`. This is the CLI entrypoint to Spotlight,
and we can use it to look up applications using their unique bundle
identifiers.
If `mdfind` fails to find anything, this falls back to the hardcoded
paths. That should keep Chrome detection working when Spotlight
indexing is off, as long as Chrome is in the default path.
2025-02-19 13:55:18 -08:00
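The lookup-then-fallback behavior described in the commit message above can be sketched roughly as follows. This is an illustration only, not the code from the commit: the function names `mdfind_app` and `find_chrome`, the bundle identifier constant, the fallback path list, and the injectable `run`/`exists` parameters are all assumptions made for the example.

```python
import os
import subprocess

# Assumed values for illustration; not taken from the commit itself.
CHROME_BUNDLE_ID = "com.google.Chrome"
FALLBACK_PATHS = [
    "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
]


def mdfind_app(bundle_id, run=subprocess.run):
    """Return the app bundle paths Spotlight reports for bundle_id, or []."""
    try:
        result = run(
            ["mdfind", f"kMDItemCFBundleIdentifier == '{bundle_id}'"],
            capture_output=True,
            text=True,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.SubprocessError):
        # mdfind is missing (not macOS) or timed out: treat as "no results".
        return []
    if result.returncode != 0:
        return []
    return [line for line in result.stdout.splitlines() if line.strip()]


def find_chrome(run=subprocess.run, exists=os.path.exists):
    """Try Spotlight first, then fall back to the hardcoded paths."""
    for app in mdfind_app(CHROME_BUNDLE_ID, run=run):
        exe = os.path.join(app, "Contents", "MacOS", "Google Chrome")
        if exists(exe):
            return exe
    for path in FALLBACK_PATHS:
        if exists(path):
            return path
    return None
```

Passing `run` and `exists` as parameters is a choice made for this sketch so the logic can be exercised without macOS or Spotlight; calling `find_chrome()` with no arguments uses the real `subprocess.run` and `os.path.exists`.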
Barbara Miller
591ba3c95a
bump version
2025-02-14 13:05:38 -08:00
Barbara Miller
c63f4296a6
Merge pull request #323 from galgeek/bmiller/better_fetch_url_timeout_errors
...
better error handling for _fetch_url
2025-02-14 12:40:04 -08:00
Barbara Miller
71ffbddfeb
log _fetch_url completion
2025-02-14 10:38:24 -08:00
Barbara Miller
ba7031f2da
better exceptions for fetch_url
2025-02-14 09:39:41 -08:00
Barbara Miller
732a7943f0
http, not https, maybe
2025-02-13 17:57:53 -08:00
Barbara Miller
819a483227
black'd
2025-02-13 17:55:36 -08:00
Barbara Miller
9dca200230
cert_reqs="CERT_NONE"
2025-02-13 16:27:05 -08:00
Barbara Miller
4af48be6ca
use urllib3
2025-02-13 16:11:22 -08:00
Barbara Miller
2c9c040b84
black'd
2025-02-13 14:24:27 -08:00
Barbara Miller
53a1869def
better error handling for _fetch_url
2025-02-13 14:21:24 -08:00
Barbara Miller
5fccdd83e3
bump version to 1.6.8
2025-02-11 17:34:14 -08:00
Barbara Miller
bfc85f5e89
Merge pull request #322 from galgeek/bmiller/more_better_requests
...
requests timeout for fetch_url, plus user_agent
2025-02-11 17:33:14 -08:00
Barbara Miller
ca79c3a329
minor logging fix
2025-02-11 17:28:18 -08:00
Barbara Miller
430c0daf39
catch and log more exceptions on fetch_url error
2025-02-11 12:51:26 -08:00
Barbara Miller
561e0803c6
requests timeout and user_agent
2025-02-11 12:27:50 -08:00
Barbara Miller
65de0d2a5f
timeout for fetch_url
2025-02-09 11:13:03 -08:00
Adam Miller
7ededbc521
Merge pull request #318 from internetarchive/adam/get-page-header-timeout
...
feat: add timeout to header check
2025-02-06 11:22:28 -08:00
Adam Miller
8ed517c1c0
chore: bump version
2025-02-06 11:19:23 -08:00
Adam Miller
3afc63242b
fix: syntax bug on HEADER_REQUEST_TIMEOUT
2025-02-05 12:38:41 -08:00
Adam Miller
c5844dfdd6
chore: cleanup unused variable
2025-02-04 16:36:23 -08:00
Adam Miller
0feac5cd07
feat: add timeout to header check
2025-02-04 16:21:28 -08:00
Barbara Miller
df4bd148d5
bump version and update copyright
2025-01-23 16:26:16 -08:00
Barbara Miller
a749b2968b
Merge pull request #316 from galgeek/bmiller/shorter_behavior_timeout
...
shorter behavior timeout
2025-01-23 15:37:29 -08:00
Barbara Miller
5e701e9dbe
Merge pull request #315 from galgeek/bmiller/proxy_select
...
yt-dlp proxy handling update
2025-01-23 15:37:01 -08:00
Adam Miller
1e30b4f478
Merge pull request #312 from internetarchive/adam/patch-yt-dlp-infinite-loop-bug
...
feat: override yt-dlp generic extractor to add redirect loop detectio…
2025-01-23 15:30:56 -08:00
Barbara Miller
2905324435
behavior_timeout=300seconds
2025-01-23 14:56:44 -08:00
Barbara Miller
9e09782984
ytdlp_proxy_file param
2025-01-23 14:35:34 -08:00
Barbara Miller
b22349e281
black'd
2025-01-23 12:37:56 -08:00
Barbara Miller
baa33e3079
ytdlp_proxy
2025-01-23 12:17:07 -08:00
Barbara Miller
854970f4dd
black'd
2025-01-23 11:21:05 -08:00
Barbara Miller
170377fe89
yt-dlp proxy handling update
2025-01-23 10:58:32 -08:00
Adam Miller
493587ca2c
fix: return ie_result and cleanup variable names to properly represent hop depth instead of redirects
2025-01-15 12:00:07 -08:00
Adam Miller
a250eb2b68
fix: ensure url is not a video when determining if we are in a redirect
2025-01-06 18:56:22 -08:00
Adam Miller
5be1b3b22a
chore: formatting
2025-01-06 18:23:17 -08:00
Adam Miller
1596667919
chore: rewrite approach using process_ie_result
2025-01-06 18:20:30 -08:00
Adam Miller
426570b084
feat: Handle too many redirects as well
2025-01-06 11:30:46 -08:00