Commit graph

1720 commits

Author SHA1 Message Date
Barbara Miller
fd8b1570a4 bump qa version 2025-07-15 17:21:54 -07:00
Barbara Miller
935241102b Merge branch 'predup_type_playlist' into qa 2025-07-15 17:16:30 -07:00
Barbara Miller
5b967b9738 log urls we skipped adding to outlinks 2025-07-15 17:16:00 -07:00
Barbara Miller
5b92338d98 Merge branch 'predup_type_playlist' into qa 2025-07-15 14:43:58 -07:00
Barbara Miller
3af8247277 updates for QA deploy 2025-07-15 14:42:53 -07:00
Barbara Miller
bc987b2a27 add get_recent_video_capture, mostly 2025-07-14 17:07:33 -07:00
Barbara Miller
3c1328ff53 save_video_capture_record 2025-07-14 16:19:38 -07:00
Barbara Miller
701707c7ce save video_record here? 2025-07-14 16:19:38 -07:00
Barbara Miller
825de5f728 worker._video_data and seed_id mostly 2025-07-14 16:19:38 -07:00
Barbara Miller
d4e0aa67ec more new fields 2025-07-14 16:19:38 -07:00
Barbara Miller
2422e9da04 def create_video_capture_record minimally 2025-07-14 16:19:38 -07:00
Barbara Miller
2fc30b029c dataclass VideoCaptureRecord instead 2025-07-14 16:19:38 -07:00
Barbara Miller
43151027f3 dataclass VideoDataRecord 2025-07-14 16:19:38 -07:00
Barbara Miller
b4b950c0fc self._video_data in worker 2025-07-14 16:19:38 -07:00
Barbara Miller
db17335ffb fix github ruff issues 2025-07-14 16:19:34 -07:00
Barbara Miller
7d58a9ae3b keep it simple for now 2025-07-14 16:17:29 -07:00
Barbara Miller
046db4b6cc VideoDataClient, generalized 2025-07-14 16:17:29 -07:00
Barbara Miller
de4e7e0c08 VideoDataWrapper refined 2025-07-14 16:17:29 -07:00
Barbara Miller
f21d312ca9 initial interface update 2025-07-14 16:17:29 -07:00
Barbara Miller
667feae559 ruff import block fix 2025-07-14 16:17:29 -07:00
Barbara Miller
8dcac47ae8 type hint get_video_captures 2025-07-14 16:17:29 -07:00
Barbara Miller
af1aaeee34 containing_page_url_pattern update 2025-07-14 16:17:29 -07:00
Barbara Miller
0526eb816c make psycopg dependency optional 2025-07-14 16:17:29 -07:00
Barbara Miller
203d86f402 use job_conf.get() 2025-07-14 16:17:29 -07:00
Barbara Miller
fe5ad0c31d VIDEO_DATA_SOURCE 2025-07-14 16:17:29 -07:00
Barbara Miller
f925660eb4 skip ternary op for now 2025-07-14 16:17:29 -07:00
Barbara Miller
fd0e0d3f30 variable VIDEO_DATA 2025-07-14 16:17:29 -07:00
Barbara Miller
c0db5b9403 CONCAT 2025-07-14 16:17:29 -07:00
Barbara Miller
92a6cacb5f ruff format updates 2025-07-14 16:17:29 -07:00
Barbara Miller
03b329cd2a formatting fix 2025-07-14 16:17:29 -07:00
Barbara Miller
55e446a41a initial commit 2025-07-14 16:17:29 -07:00
Misty De Méo
f9cc2ea48e ci: test with 3.14 beta
3.14 beta 4 is very late in the cycle, so it seems like a good time
for us to start testing with it to make sure we're ready.
2025-07-10 09:47:18 -07:00
Misty De Méo
aea4286bd1 ci: use uv 2025-07-10 09:41:09 -07:00
Misty De Méo
7b691fe397 worker: skip audio content-types for media exclusion 2025-07-07 14:41:03 -07:00
Misty De Méo
a0f60c1051 Video exclusion: skip YouTube UMP packets too
In testing a page with an embedded YouTube video with video
exclusion enabled, I found that brozzler ended up capturing about
30MB of UMP packets. We should be filtering those out too.
2025-06-26 17:13:24 -07:00
Misty De Méo
5ff893ddaf brozzler-new-site: add flag to disable videos
This makes it easier to test the new video exclusion work.
2025-06-26 14:38:15 -07:00
Misty De Méo
38f164dbc4 Makefile: remove target-version
This can be inferred from our pyproject.toml.
2025-06-26 09:04:51 -07:00
Misty De Méo
f9848efc1e tests: recognize CI=true 2025-06-26 09:04:51 -07:00
Misty De Méo
a4e5418e13 tests: enable format check 2025-06-26 09:04:51 -07:00
Misty De Méo
0f2c166e2a tests: use github-format in ci 2025-06-26 09:04:51 -07:00
Misty De Méo
422527d7e4 tests: ruff fixes 2025-06-25 15:50:39 -07:00
Misty De Méo
70e4c3d7f6 worker: fix possibly-unbound status code
We assigned this inside an exception handler, and allow
processing to continue on after catching the exception.
2025-06-25 15:42:56 -07:00
Misty De Méo
d33df40283 gitignore: ignore warcprox files
These are created by some tests.
2025-06-12 15:45:42 -07:00
Misty De Méo
bee01d32b8 deps: yt-dlp 2025.05.22 2025-06-12 13:14:38 -07:00
Misty De Méo
8b20ea91bb Move classifiers from setup.py 2025-06-12 13:05:49 -07:00
Misty De Méo
33f60ce609 Drop Python 3.8 support
Python 3.8 is EOL since October. It's no longer supported by new versions
of yt-dlp, limiting video capture support. It's also no longer supported
by setuptools, which has complicated distribution - it's preventing us
from keeping packaging configuration up to date.
2025-06-12 12:55:17 -07:00
Misty De Méo
0227da6530 brozzler 1.7.0 2025-06-12 10:52:25 -07:00
Barbara Miller
a46d615365 HEADER_REQUEST_TIMEOUT = 60, not 30 2025-06-09 16:20:18 -07:00
Gretchen Leigh Miller
40613e35b4 WT-2950 Implement Seed-level video capture setting handling + Job-level PDF-only option 2025-06-04 13:25:48 -07:00
Misty De Méo
14ccd6f4e7 deps: specify more extras for yt-dlp 2025-06-04 13:21:34 -07:00