Commit graph

1721 commits

Author SHA1 Message Date
Misty De Méo
14ccd6f4e7 deps: specify more extras for yt-dlp 2025-06-04 13:21:34 -07:00
Misty De Méo
94920b8b98 ci: two fixes to dependabot action
These were necessary in qa for the automated yt-dlp action to run to completion.
2025-05-30 13:27:42 -07:00
Misty De Méo
f8ede3d605 ci: remember to merge after approving
2025-05-29 16:40:55 -07:00
Misty De Méo
944dc4c478 ci: install chrome before uv sync 2025-05-29 16:40:55 -07:00
Misty De Méo
3513da068d tests: separate out youtube tests
Right now we expect these YouTube tests to fail for reasons unrelated to
yt-dlp. We still want to try them, but we won't count them towards
capture failures.
2025-05-28 13:06:54 -07:00
Misty De Méo
7d3155652e pyproject: remove dynamic fields
At this point might as well move the rest of these into the pyproject.toml,
taking them out of setup.py entirely.
2025-05-27 16:53:44 -07:00
Misty De Méo
189f669998 deps: move to pyproject.toml
Dependabot seems to be having trouble parsing our extras; see if
this fixes it.
2025-05-27 15:41:59 -07:00
Misty De Méo
943acd35d6 fix: dependabot.yml location 2025-05-27 15:18:43 -07:00
Misty De Méo
794f7dd98d ci: set up a yt-dlp test script
This runs every time we get a new yt-dlp version - we test to see if
this script is able to download at least 3/5 out of a set of videos
we've defined. If it succeeds, we go ahead and automatically merge
the new yt-dlp version into the qa branch so that we can test
further.
2025-05-27 15:09:15 -07:00
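The pass/fail rule described in 794f7dd98d can be sketched as follows. This is a minimal illustration, not the script from the commit: the function name `enough_downloads_succeeded` and the list-of-booleans input are hypothetical, and the sketch reads "3/5" as a minimum success ratio.

```python
def enough_downloads_succeeded(results, required=3, out_of=5):
    """Return True when at least required/out_of of the attempts succeeded.

    `results` holds one boolean per test video (True means it downloaded).
    Hypothetical helper: the real test script may count successes differently.
    """
    if not results:
        # No attempts means nothing was verified, so treat it as a failure.
        return False
    # Integer cross-multiplication avoids floating-point comparison.
    return sum(results) * out_of >= required * len(results)
```

A caller would auto-merge the new yt-dlp version into the qa branch only when this returns True.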
Misty De Méo
b4d2726e54 dependabot: add yt-dlp in qa
Also sets up an auto-approve for dependabot PRs.
2025-05-27 15:09:15 -07:00
Misty De Méo
2984bd955b warcprox 2.7.0
2025-05-23 15:51:53 -07:00
Misty De Méo
bed0599d6e ci: fix publish needs write permissions
Applies fixes from https://github.com/internetarchive/warcprox/pull/220
2025-05-23 15:47:07 -07:00
Barbara Miller
aadd9cd521
bump version: 1.6.13
2025-05-22 14:45:10 -07:00
Barbara Miller
6b249478cc
Merge pull request #355 from mikemccabe/mccabe/disable-auto-https
Try new flag to disable auto http->https
2025-05-22 14:44:03 -07:00
Mike McCabe
d6e079d8cb Try new flag to disable auto http->https
Fix for https://webarchive.jira.com/browse/WWM-2292 (as seen by pyspn)
2025-05-21 21:07:42 -07:00
Barbara Miller
370638a876
bump version: 1.6.12
2025-05-19 15:20:10 -07:00
Barbara Miller
cb4a846f4a
Merge pull request #348 from internetarchive/adam/new_claim_sites_query
feat: Create new claim_sites() query, and fix frontier tests
2025-05-19 15:19:06 -07:00
Barbara Miller
8b1d80fcc3
bump version: 1.6.11 2025-05-19 12:30:08 -07:00
Barbara Miller
79d288bf17
Merge pull request #353 from galgeek/barbara/misc_ytdlp
ytdlp config updates for saved livestreams (mostly)
2025-05-19 12:24:07 -07:00
Barbara Miller
a665d49bba
Merge pull request #350 from mistydemeo/misty/add_thread_to_dict
cli: add thread name to event dict
2025-05-15 12:42:04 -07:00
Misty De Méo
b3fbdceeca CI: add a simple tag-to-release config
This adds a tag-to-release Actions config based around uv.

This is triggered by pushing a tag with a new version; it will
automatically kick off this job, which will publish the new
version to PyPI on completion. We can push that tag from a PR
or directly to master.

At the moment, this doesn't do anything to automatically create
a GitHub release from the tag; we can do that manually for now,
but if we're interested I can add something to automatically
generate the release too.

We don't need to provide a token to uv to publish; instead, we
just need to configure the repo for PyPI access using this:
https://docs.pypi.org/trusted-publishers/adding-a-publisher/
2025-05-08 15:30:42 -07:00
Barbara Miller
1c59a076b5 stop skipping HLS, too 2025-05-07 18:41:49 -07:00
Barbara Miller
3dfcc2ade6 don't skip dash, do impersonate, smaller sleep intervals 2025-05-07 15:20:33 -07:00
Misty De Méo
aa86928154 cli: add thread name to event dict
I missed this item from the old log formatting config in the previous
structlog PRs.
2025-05-02 15:57:13 -07:00
Adam Miller
d36313f08f chore: ruff format pass 2025-04-15 14:05:54 -07:00
Adam Miller
0f57188a2c refactor: short circuit claimable sites loop when we have enough sites 2025-04-15 14:03:15 -07:00
Adam Miller
f0d527cda7 chore: merge logged proxy info into existing log call 2025-04-15 13:40:37 -07:00
Adam Miller
cdb81496f6 chore: disable cluster tests, add frontier load test 2025-04-01 14:16:42 -07:00
Adam Miller
addf73f865 chore: Additional frontier testing and reformat 2025-03-31 16:03:44 -07:00
Adam Miller
e7e4225bf2 chore: fixing more tests 2025-03-27 17:12:17 -07:00
Adam Miller
b5ee8a9ea7 feat: Create new claim_sites() query, and fix frontier tests 2025-03-26 18:06:55 -07:00
Adam Miller
42b4a88c96
Merge pull request #347 from internetarchive/adam/annotate_claim_sites
chore: annotate claim_sites()
2025-03-26 10:19:17 -07:00
Adam Miller
ae82d6fc13 chore: reformat with ruff 2025-03-26 10:11:00 -07:00
Adam Miller
fd633c32bf chore: additional claim_sites() annotation 2025-03-25 14:34:52 -07:00
Adam Miller
c249aa1728 chore: annotate claim_sites() 2025-03-21 17:09:47 -07:00
Misty De Méo
7fc45fe6d0 brozzler 1.6.10
2025-03-12 12:05:25 -07:00
Misty De Méo
af34639adb test: fix test brozzler imports
2025-03-11 11:12:15 -07:00
Misty De Méo
3ef0c3abc9
Merge pull request #345 from mistydemeo/fix_worker_id
fix: bind worker_id inside BrozzlerWorker
2025-03-10 14:42:06 -07:00
Misty De Méo
a902ae7a02 fix: bind worker_id inside BrozzlerWorker
This ensures the parameter remains available within a multithreaded context.
2025-03-10 14:03:14 -07:00
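One plausible reading of a902ae7a02, offered as an assumption rather than a description of the actual change: context bound in the main thread is not guaranteed to be visible in worker threads, so the binding happens inside the code that runs on the worker thread. The sketch below uses the standard library's `contextvars`; the `Worker` class, `worker_id_var`, and `run_in_thread` are hypothetical names and do not mirror BrozzlerWorker.

```python
import contextvars
import threading

# Hypothetical context variable standing in for a logging context.
worker_id_var = contextvars.ContextVar("worker_id", default=None)


class Worker:
    def __init__(self, worker_id):
        self.worker_id = worker_id

    def run(self, seen):
        # Bind inside the thread that does the work, so the value is
        # present regardless of what the spawning thread had bound.
        worker_id_var.set(self.worker_id)
        seen.append(worker_id_var.get())


def run_in_thread(worker):
    """Run the worker on a new thread and return the ID it observed."""
    seen = []
    thread = threading.Thread(target=worker.run, args=(seen,))
    thread.start()
    thread.join()
    return seen[0]
```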
Misty De Méo
353cc1b9fd deps: install setuptools on python 3.12+
distutils was removed beginning in Python 3.12, but it's used at
runtime by rethinkdb 2.4.9. setuptools provides a copy of distutils,
so we should make sure to install it when we're on Python 3.12 or
newer until we're able to upgrade to a version of rethinkdb that
no longer needs it.

See: https://www.python.org/downloads/release/python-3120/
2025-03-07 17:13:54 -08:00
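The version rule in 353cc1b9fd can be sketched as a small check. The helper name `needs_setuptools` is hypothetical; in packaging metadata the same condition is usually written as an environment marker such as `setuptools; python_version >= "3.12"`, though whether the project expresses it exactly that way is an assumption.

```python
import sys


def needs_setuptools(version_info=sys.version_info):
    """Return True on Python 3.12+, where the stdlib no longer ships distutils.

    Hypothetical helper for illustration: on those versions setuptools is
    installed so that its bundled copy of distutils is available at runtime.
    """
    return tuple(version_info[:2]) >= (3, 12)
```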
Gretchen Leigh Miller
65b0b5f50b
Makefile improvements + pre-commit hook (#340)
* Makefile improvements + pre-commit hook

* update make target in CI

* fix CI more

* .gitignore update

* couple more Makefile refinements

* make target-version explicit on ruff import sorting
2025-03-07 16:45:53 -08:00
Gretchen Leigh Miller
f64db214d4
ruff linting fixes (#343)
* ruff linting fixes

* move imports back down to where they're re-exported
2025-03-07 16:03:35 -08:00
Gretchen Leigh Miller
6f011cc6c8
ruff import sorting pass + adding uv.lock (#342)
* ruff import sorting pass

* add uv.lock

* move comment back to its proper place
2025-03-07 10:04:11 -08:00
Misty De Méo
21102ca95c
__init__.py: rework imports (#334)
* __init__.py: rework imports

Although doublethink is an optional dependency to allow brozzler to be
used as a library without it, in practice we had some mandatory import
statements that prevented brozzler from being imported without it.
This fixes that by gating off some of the imports and exports.

If doublethink is available, brozzler works as it is now. But if it
isn't, we make a few changes:

* brozzler.worker, brozzler.cli and brozzler.model reexports are
  disabled
* One brozzler.cli function, which is used outside brozzler's own cli,
  has been moved into brozzler's __init__.py. For compatibility, it's
  reexported from brozzler.cli.

* Make tz-aware datetime of the epoch with stdlib

* Only import yt-dlp if we're using it

* ydl: never try if extra missing

* cli: use worker's yt-dlp check

---------

Co-authored-by: Alex Dempsey <avdempsey@archive.org>
2025-03-06 14:49:22 -08:00
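The import gating described in 21102ca95c can be sketched roughly as below. This is an illustration under stated assumptions, not brozzler's actual `__init__.py`: `optional_import`, `HAVE_DOUBLETHINK`, `EPOCH_UTC`, and `exported_names` are hypothetical names. The epoch constant illustrates the "tz-aware datetime of the epoch with stdlib" item from the same commit.

```python
import datetime
import importlib


def optional_import(name):
    """Import a module if it is installed; return None otherwise."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# doublethink is optional, so the package must still import without it.
doublethink = optional_import("doublethink")
HAVE_DOUBLETHINK = doublethink is not None

# Timezone-aware Unix epoch built from the standard library alone.
EPOCH_UTC = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def exported_names():
    """List the names to re-export, gated on the optional dependency."""
    names = ["EPOCH_UTC"]
    if HAVE_DOUBLETHINK:
        # These re-exports are only enabled when doublethink is present.
        names += ["worker", "cli", "model"]
    return names
```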
Misty De Méo
0f707dc02b CI: extend daily job timeout
This was left at the default of six hours, but it timed out last
night. I'll set it at eight hours to see if this is more reliable.
2025-03-05 17:10:27 -08:00
Gretchen Leigh Miller
5350c202dc
Update README.rst to remove brozzler-easy and Wayback sections + other cleanup (#336)
* update instructions for brozzler-easy + add pywb extras

* revert pywb extra + updated README

* ruffing up

* more README.rst updates

* revert https change for local URL scheme
2025-03-05 14:33:47 -08:00
Misty De Méo
05b72906bd remove travis config 2025-03-05 13:34:03 -08:00
Misty De Méo
b45e5dc096 CLI: add new --worker-id option
This adds a new commandline flag allowing the worker ID to be specified.
If present, it will be added to the global context so that it will be
included in every logging statement.

Previously, we only had some indirect values to tie logging statements
to specific workers, so this should make it easier to follow.
2025-03-05 11:01:50 -08:00
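A minimal sketch of the idea in b45e5dc096, using only the standard library. The `--worker-id` flag name comes from the commit; everything else (`WorkerIdFilter`, `log_line`, the logger name, and the format string) is a hypothetical stand-in, and the project's real logging setup may attach the ID differently.

```python
import argparse
import io
import logging


class WorkerIdFilter(logging.Filter):
    """Attach a fixed worker ID to every record that passes through."""

    def __init__(self, worker_id):
        super().__init__()
        self.worker_id = worker_id

    def filter(self, record):
        record.worker_id = self.worker_id
        return True


def log_line(argv, message):
    """Parse argv, log one message, and return the text that was emitted."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--worker-id", default=None)
    args = parser.parse_args(argv)

    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s worker_id=%(worker_id)s %(message)s")
    )
    handler.addFilter(WorkerIdFilter(args.worker_id))

    logger = logging.getLogger("worker_id_sketch")
    logger.handlers = [handler]  # replace handlers so repeated calls don't stack
    logger.setLevel(logging.INFO)
    logger.propagate = False

    logger.info(message)
    return stream.getvalue()
```

Because the filter sits on the handler, every statement routed through it carries the ID without each call site passing it explicitly.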
Misty De Méo
af0f3ed378 CLI: enable log prefixing
This adds a commandline option which enables log level prefixing.
These prefixes enable log level-based filtering in journalctl when
present so long as logs are going to the journal, and
`SyslogLevelPrefix=` is set to `true` (which it is by default).

For documentation: https://manpages.debian.org/testing/libsystemd-dev/sd-daemon.3.en.html
2025-03-05 11:01:50 -08:00
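The prefixing described in af0f3ed378 follows the sd-daemon convention of starting each line with `<N>`, where N is a syslog priority (for example `<3>` for errors and `<6>` for informational messages). A rough standard-library sketch follows; the `JournalPrefixFormatter` class and the mapping from Python levels to priorities are assumptions made for illustration, not the commit's implementation.

```python
import logging

# Python logging level -> syslog priority used in the "<N>" prefix.
_SYSLOG_PRIORITY = {
    logging.CRITICAL: 2,
    logging.ERROR: 3,
    logging.WARNING: 4,
    logging.INFO: 6,
    logging.DEBUG: 7,
}


class JournalPrefixFormatter(logging.Formatter):
    """Prefix each formatted record with its sd-daemon priority marker."""

    def format(self, record):
        # Unknown or custom levels fall back to the informational priority.
        priority = _SYSLOG_PRIORITY.get(record.levelno, 6)
        return f"<{priority}>{super().format(record)}"
```

With output going to the journal, a filter such as `journalctl -p err` can then select records by level.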
Misty De Méo
f384d0b830 deps: add dev deps to pyproject.toml 2025-03-05 10:07:29 -08:00