mirror of https://github.com/internetarchive/brozzler.git (synced 2025-05-02 06:36:20 -04:00)
use urlcanon library for canonicalization, surtification, scope match rules
parent 479f0f7e09
commit 12fb9eaa15
11 changed files with 78 additions and 232 deletions
@@ -75,7 +75,6 @@ blocks:
 - domain: twitter.com
   url_match: REGEX_MATCH
   value: ^.*lang=(?!en).*$
 - bad_thing: bad rule should be ignored
 ''')

 site = brozzler.Site(None, {
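To make the block rule in the hunk above concrete, here is a minimal sketch of how a rule with `domain`, `url_match: REGEX_MATCH` and `value` could be evaluated against a URL. It uses only the Python standard library and is not the urlcanon implementation this commit switches to. The names `host_in_domain`, `rule_applies` and `KNOWN_KEYS` are hypothetical, and skipping rules with unrecognized keys is an assumption that echoes the "bad rule should be ignored" line; the hunk does not show how brozzler handles such rules after this commit.

```python
import re
from urllib.parse import urlsplit

# Hypothetical set of keys this sketch understands; not taken from brozzler.
KNOWN_KEYS = {"domain", "url_match", "value"}


def host_in_domain(host, domain):
    """True if host is the domain itself or one of its subdomains."""
    host = host.lower().rstrip(".")
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


def rule_applies(rule, url):
    """Sketch of rule evaluation: every condition in the rule must hold."""
    # Assumption: a rule with unrecognized keys is ignored (never applies).
    if set(rule) - KNOWN_KEYS:
        return False
    host = urlsplit(url).hostname or ""
    if "domain" in rule and not host_in_domain(host, rule["domain"]):
        return False
    if rule.get("url_match") == "REGEX_MATCH":
        # The regex is matched against the whole URL string.
        return re.match(rule["value"], url) is not None
    return True


# The rules as they appear in the hunk.
blocks = [
    {"domain": "twitter.com",
     "url_match": "REGEX_MATCH",
     "value": r"^.*lang=(?!en).*$"},
    {"bad_thing": "bad rule should be ignored"},
]


def is_blocked(url):
    """A URL is blocked if any block rule applies to it."""
    return any(rule_applies(rule, url) for rule in blocks)
```

Under these assumptions, the negative lookahead `(?!en)` means the twitter.com rule blocks URLs whose `lang=` parameter starts with anything other than `en`, while URLs on unrelated hosts are left alone regardless of their query string.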