## ℹ️ Description
Currently, release changelogs only show the last commit message, which
doesn't provide sufficient visibility into all changes included in a
release. This PR improves the release workflow to use GitHub's
auto-generated release notes, providing a comprehensive changelog of all
commits and PRs since the previous release.
- Addresses the issue of insufficient release changelog detail
- Improves transparency for users reviewing what changed in each release
## 📋 Changes Summary
- Added `--generate-notes` flag to `gh release create` command in
`.github/workflows/build.yml`
- Renamed `COMMIT_MSG` environment variable to `LEGAL_NOTICE` for better
clarity
- Legal disclaimers now append after the auto-generated changelog
instead of replacing it
- The auto-generated notes will include:
- All commits since the last release
- All merged PRs since the last release
- Contributor attribution
- Automatic categorization (New Contributors, Full Changelog, etc.)
### ⚙️ Type of Change
Select the type(s) of change(s) included in this pull request:
- [ ] 🐞 Bug fix (non-breaking change which fixes an issue)
- [x] ✨ New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)
## ✅ Checklist
Before requesting a review, confirm the following:
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
* Release process updated to embed a bilingual (English/German) legal
notice directly into generated release notes.
* Release creation now auto-generates notes using that legal notice so
published releases consistently include the legal text.
<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## ℹ️ Description
Fixes a race condition where ads were submitted before all images
finished uploading to the server, causing some images to be missing from
published ads.
- Link to the related issue(s): Issue #715
- The bot was submitting ads immediately after the last image
`send_file()` call completed, only waiting 1-2.5 seconds via
`web_sleep()`. This wasn't enough time for server-side image processing,
thumbnail generation, and DOM updates to complete, resulting in missing
images in published ads.
## 📋 Changes Summary
### Image Upload Verification (Initial Fix)
- Added thumbnail verification in `__upload_images()` method to wait for
all image thumbnails to appear in the DOM after upload
- Added configurable timeout `image_upload` to `TimeoutConfig` (default:
30s, minimum: 5s)
- Improved error messages to show expected vs actual image count when
upload times out
- Added German translations for new log messages and error messages
- Regenerated JSON schemas to include new timeout configuration
### Polling Performance & Crash Fix (Follow-up Fix)
- Fixed critical bug where `web_find_all()` would raise `TimeoutError`
when no thumbnails exist yet, causing immediate crash
- Wrapped DOM queries in `try/except TimeoutError` blocks to handle
empty results gracefully
- Changed polling to use `self._timeout("quick_dom")` (~1s with PR #718)
instead of default timeout
- Improved polling performance: reduced cycle time from ~2s to ~1.5s
- DOM queries are client-side only (no server load from frequent
polling)
**New configuration option:**
```yaml
timeouts:
image_upload: 30.0 # Total timeout for image upload and server-side processing
quick_dom: 1.0 # Per-poll timeout for thumbnail checks (adjustable via multiplier)
```
The bot now polls the DOM for `ul#j-pictureupload-thumbnails >
li.ui-sortable-handle` elements after uploading images, ensuring
server-side processing is complete before submitting the ad form.
### ⚙️ Type of Change
- [x] 🐞 Bug fix (non-breaking change which fixes an issue)
- [ ] ✨ New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)
## ✅ Checklist
Before requesting a review, confirm the following:
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Image uploads now verify completion by waiting for all uploaded
thumbnails to appear before proceeding.
* **Improvements**
* Added a configurable image upload timeout (default 30s, minimum 5s).
* Improved timeout reporting: when thumbnails don’t appear in time, the
app returns clearer feedback showing expected vs. observed counts.
<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## ℹ️ Description
This PR updates the default browser configuration to be safe for
Chrome/Chromium 136+ out of the box.
Chrome 136+ (released March 2025) requires `--user-data-dir` to be
specified when using `--remote-debugging-port` for security reasons.
Since nodriver relies on remote debugging, the bot needs proper defaults
to avoid validation errors.
**Motivation:** Eliminate Chrome 136+ configuration validation errors
for fresh installations and ensure session persistence by default.
## 📋 Changes Summary
- Set `browser.arguments` default to include
`--user-data-dir=.temp/browser-profile`
- Set `browser.user_data_dir` default to `.temp/browser-profile`
(previously `None`)
- Regenerated JSON schema (`config.schema.json`) with new defaults
**Benefits:**
- ✅ Chrome 136+ compatible out of the box (no validation errors)
- ✅ Browser session/cookies persist across runs (better UX)
- ✅ Consistent with existing `.temp` directory pattern (update state,
caches)
- ✅ Already gitignored - no accidental commits of browser profiles
**No breaking changes:** Existing configs with explicit
`browser.arguments: []` continue to work (users can override defaults).
### ⚙️ Type of Change
- [x] ✨ New feature (adds new functionality without breaking existing
usage)
## ✅ Checklist
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Standardized browser profile configuration with improved default user
data directory settings.
<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## ℹ️ Description
Added Webselect-Function for Input/Dropdown Combobox
PR for issue/missing feature #677
# Fixes / Enhancements
Finding Special Attributes Elements can fail because they are currently
only selected using the name="..." attributes of the HTML elements. If
it fails, ALSO fallback-handle selecting special attribute HTML elements
by ID instead / additionally. (For example the "brands" Input/Combobox
for Mens Shoes...
When trying to select a Value in a <select>, it does not only rely on
the actual Option value (xxx in the example <options
value="xxx">yyy</...>) but instead also on the displayed HTML value
(i.e. yyy in above example). This improves UX because the User doesnt
have to check the actual "value" of the Option but instead can check the
displayed Value from the Browsers Display directly.
Testcases for Webselect_Combobox were not added due to missing knowledge
about Async Mocking properly.
## 📋 Changes Summary
✅ Fixes & Enhancements
- New WebSelect Functionality
- Improved Element Detection for Special Attributes
- Enhanced <select> Option Matching Logic
This improves UX and test robustness — users no longer need to know the
exact underlying value, as matching also works with the visible label
shown in the browser.
🧩 Result
These updates make dropdown and combobox interactions more intuitive,
resilient, and user-friendly across diverse HTML structures.
### ⚙️ Type of Change
Select the type(s) of change(s) included in this pull request:
- [x] 🐞 Bug fix (non-breaking change which fixes an issue)
- [x] ✨ New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)
## ✅ Checklist
Before requesting a review, confirm the following:
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [ ] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Field lookup now falls back to locating by ID when name lookup times
out.
* Option selection uses a two-pass match (value then displayed text);
JS-path failures now surface as timeouts.
* Error and log messages localized and clarified.
* **New Features**
* Support for combobox-style inputs: type into the input, open dropdown,
and select by visible text (handles special characters).
* **Tests**
* Added tests for combobox selection, missing dropdowns, no-match
errors, value-path selection, and special-character handling.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Jens <1742418+1cu@users.noreply.github.com>
Co-authored-by: Claude <claude@anthropic.com>
## ℹ️ Description
Eliminate all blocking I/O operations in async contexts and modernize
file path handling by migrating from os.path to pathlib.Path.
- Link to the related issue(s): #692
- Get rid of the TODO in pyproject.toml
- The added debug logging will ease the troubleshooting for path related
issues.
## 📋 Changes Summary
- Enable ASYNC210, ASYNC230, ASYNC240, ASYNC250 Ruff rules
- Wrap blocking urllib.request.urlopen() in run_in_executor
- Wrap blocking file operations (open, write) in run_in_executor
- Replace blocking os.path calls with async helpers using
run_in_executor
- Replace blocking input() with await ainput()
- Migrate extract.py from os.path to pathlib.Path
- Use Path() constructor and / operator for path joining
- Use Path.mkdir(), Path.rename() in executor instead of os functions
- Create mockable _path_exists() and _path_is_dir() helpers
- Add debug logging for all file system operations
### ⚙️ Type of Change
Select the type(s) of change(s) included in this pull request:
- [X] 🐞 Bug fix (non-breaking change which fixes an issue)
- [ ] ✨ New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)
## ✅ Checklist
Before requesting a review, confirm the following:
- [X] I have reviewed my changes to ensure they meet the project's
standards.
- [X] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [X] I have formatted the code (`pdm run format`).
- [X] I have verified that linting passes (`pdm run lint`).
- [X] I have updated documentation where necessary.
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Refactor**
* Made user prompt non‑blocking to improve responsiveness.
* Converted filesystem/path handling and prefs I/O to async‑friendly
operations; moved blocking network and file work to background tasks.
* Added async file/path helpers and async port‑check before browser
connections.
* **Tests**
* Expanded unit tests for path helpers, image download success/failure,
prefs writing, and directory creation/renaming workflows.
<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## ℹ️ Description
This PR addresses issue #708 by improving the README's About section to
make the bot's purpose clearer to new users. It also fixes a technical
inaccuracy in the configuration documentation.
- Link to the related issue(s): Issue #708
- **Motivation**: The current About section uses ambiguous terminology
("ads" instead of "listings") and doesn't clearly communicate what the
bot does. Additionally, the configuration example incorrectly documents
`shipping_costs` as available in `ad_defaults`, when it's only
implemented for per-ad configuration.
## 📋 Changes Summary
**About Section Improvements:**
- Changed "ads" to "listings" for clarity (addresses confusion mentioned
in #708)
- Added "Key Features" section with 6 concrete capabilities
- Added "Why This Project?" section explaining the rewrite and
differences from legacy client
- Preserved all legal disclaimers
**Configuration Documentation Fix:**
- Removed `shipping_costs` from `ad_defaults` example (not implemented
in `AdDefaults` Pydantic class)
- Added clarifying comment that `shipping_costs` and `shipping_options`
must be configured per-ad
- Verified `shipping_costs` remains documented in ad configuration
section
### ⚙️ Type of Change
- [x] 🐞 Bug fix (non-breaking change which fixes an issue)
- [ ] ✨ New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)
*Note: This is a documentation-only change with no code modifications.*
## ✅ Checklist
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (N/A -
documentation only).
- [x] I have formatted the code (N/A - documentation only).
- [x] I have verified that linting passes (N/A - documentation only).
- [x] I have updated documentation where necessary.
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
## ℹ️ Description
This PR improves the CodeRabbit configuration to ensure all important
project files are reviewed while excluding only build artifacts and
temporary files.
The previous configuration used a blanket `!**/.*` exclusion that was
unintentionally filtering out the entire `.github` directory, including
workflows, dependabot config, issue templates, and CODEOWNERS files.
## 📋 Changes Summary
- **Added** `.github/**` to include all GitHub automation files
(workflows, dependabot, templates, CODEOWNERS)
- **Added** root config files (`pyproject.toml`, `*.yaml`, `*.yml`,
`**/*.md`)
- **Removed** overly broad `!**/.*` exclusion pattern
- **Added** specific exclusions for Python cache directories
(`.pytest_cache`, `.mypy_cache`, `.ruff_cache`)
- **Added** explicit IDE file exclusions (`.vscode`, `.idea`,
`.DS_Store`)
- **Added** `pdm.lock` exclusion to reduce noise
### ⚙️ Type of Change
- [x] 🐞 Bug fix (non-breaking change which fixes an issue)
- [ ] ✨ New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)
## ✅ Checklist
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
* Updated internal code review configuration and automation settings.
<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## ℹ️ Description
This PR fixes the update logic for shipping options.
A different dialog sequence is used in some categories which must be
taken into account.
Also the selection of the correct shipping sizes was refactured.
- Link to the related issue(s): Issue #687
## 📋 Changes Summary
- An check was added if two dialogs have to be closed in update mode
- Logic for setting package siz was refactured in update mode
### ⚙️ Type of Change
Select the type(s) of change(s) included in this pull request:
- [x] 🐞 Bug fix (non-breaking change which fixes an issue)
- [ ] ✨ New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)
## ✅ Checklist
Before requesting a review, confirm the following:
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Improved shipping-option handling when editing listings: the flow now
more reliably navigates the shipping dialog, correctly selects or
deselects options based on item size and the desired configuration, and
avoids incorrect selections across size categories—resulting in more
consistent shipping choices when modifying ads.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## ℹ️ Description
* Strengthen the session/logging/update-check tests to exercise real
resources and guards while bringing the update-check docs in line with
the supported interval units.
- Link to the related issue(s): Issue #N/A
## 📋 Changes Summary
- Reworked the `WebScrapingMixin` session tests so they capture each
`stop` handler before the browser reference is nulled, ensuring cleanup
logic is exercised without crashing.
- Added targeted publish and update-check tests that patch the async
helpers, guard logic, and logging handlers while confirming
`requests.get` is skipped when the state gate is closed.
- Updated `docs/update-check.md` to list only the actually supported
interval units (up to 30 days) and noted the new guard coverage in the
changelog.
### ⚙️ Type of Change
- [x] 🐞 Bug fix (non-breaking change which fixes an issue)
- [ ] ✨ New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)
## ✅ Checklist
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Tests**
* Expanded test coverage for publish workflow orchestration and update
checking interval behavior.
* Added comprehensive browser session cleanup tests, including
idempotent operations and edge case handling.
* Consolidated logging configuration tests with improved handler
management validation.
* Refined test fixtures and assertions for better test reliability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## ℹ️ Description
Currently version.py isn't checked by the linters. Ran linters manually
and fixed all lints.
- Link to the related issue(s): none
## 📋 Changes Summary
- Introduced shutil.which("git") so the helper explicitly locates the
Git binary and raises a clear error when it’s absent rather than relying
on a relative PATH.
- Switched to subprocess.run(..., capture_output=True, text=True) with
the located executable, guarding the call with check=True and # noqa:
S603 since the arguments are trusted.
- Made the timestamp timezone-aware with datetime.now(timezone.utc) to
avoid implicit local-time assumptions when creating the YYYY+<commit>
version string.
### ⚙️ Type of Change
Select the type(s) of change(s) included in this pull request:
- [x] 🐞 Bug fix (non-breaking change which fixes an issue)
- [ ] ✨ New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)
## ✅ Checklist
Before requesting a review, confirm the following:
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
## ℹ️ Description
* Restrict coverage reporting to library files and collect per-suite
coverage data for Codecov’s flags.
- Link to the related issue(s): Issue #N/A
- Describe the motivation and context for this change.
## 📋 Changes Summary
- add `coverage:prepare` and per-suite `COVERAGE_FILE`s so each test
group writes its own sqlite and XML artifacts without appending
- replace the shell scripts with `scripts/coverage_helper.py`, scope the
report to `src/kleinanzeigen_bot/*`, and add logging/validation around
cleanup, pytest runs, and data combining
- ensure the helper works in CI (accepts extra pytest args, validates
file presence)
### ⚙️ Type of Change
- [x] 🐞 Bug fix (non-breaking change which fixes an issue)
- [ ] ✨ New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)
## ✅ Checklist
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.
## ℹ️ Description
- Related issues: #671, #658
- Introduces configurable timeout controls plus retry/backoff handling
for flaky DOM operations.
We often see timeouts which are note reproducible in certain
configurations. I suspect timeout issues based on a combination of
internet speed, browser, os, age of the computer and the weather.
This PR introduces a comprehensive config model to tweak timeouts.
## 📋 Changes Summary
- add TimeoutConfig to the main config/schema and expose timeouts in
README/docs
- wire WebScrapingMixin, extractor, update checker, and browser
diagnostics to honor the configurable timeouts and retries
- update translations/tests to cover the new behaviour and ensure
lint/mypy/pyright pipelines remain green
### ⚙️ Type of Change
- [ ] 🐞 Bug fix (non-breaking change which fixes an issue)
- [x] ✨ New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)
## ✅ Checklist
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Centralized, configurable timeout system for web interactions,
detection flows, publishing, and pagination.
* Optional retry with exponential backoff for operations that time out.
* **Improvements**
* Replaced fixed wait times with dynamic timeouts throughout workflows.
* More informative timeout-related messages and diagnostics.
* **Tests**
* New and expanded test coverage for timeout behavior, pagination,
diagnostics, and retry logic.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## ℹ️ Description
*Provide a concise summary of the changes introduced in this pull
request.*
- Link to the related issue(s): Issue #
- Describe the motivation and context for this change.
Refactors the test harness for faster and more reliable feedback: adds
deterministic time freezing for update checks, accelerates and refactors
smoke tests to run in-process, defaults pytest to xdist with durations
tracking, and adjusts CI triggers so PRs run the test matrix only once.
## 📋 Changes Summary
- add pytest-xdist + durations reporting defaults, force deterministic
locale and slow markers, and document the workflow adjustments
- run smoke tests in-process (no subprocess churn), mock update
checks/logging, and mark slow specs appropriately
- deflake update check interval tests by freezing datetime and simplify
FixedDateTime helper
- limit GitHub Actions `push` trigger to `main` so feature branches rely
on the single pull_request run
### ⚙️ Type of Change
Select the type(s) of change(s) included in this pull request:
- [ ] 🐞 Bug fix (non-breaking change which fixes an issue)
- [x] ✨ New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)
## ✅ Checklist
Before requesting a review, confirm the following:
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Tests**
* Ensure tests run in a consistent English locale and restore prior
locale after each run
* Mark integration scraping tests as slow for clearer categorization
* Replace subprocess-based CLI tests with an in-process runner that
returns structured results and captures combined stdout/stderr/logs;
disable update checks during smoke tests
* Freeze current time in update-check tests for deterministic assertions
* Add mock for process enumeration in web‑scraping unit tests to
stabilize macOS-specific warnings
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## ℹ️ Description
- Link to the related issue(s): Issue #N/A
- Describe the motivation and context for this change.
Pin `nodriver` to the last known good 0.47 series so we can avoid the
UTF-8 decoding regression in 0.48.x that currently breaks our local
mypy/linting runs.
## 📋 Changes Summary
- lock runtime dependency `nodriver` to `0.47.*` with an inline comment
describing the upstream regression
- refresh `pdm.lock` so local/CI installs stay on the pinned version
### ⚙️ Type of Change
- [x] 🐞 Bug fix (non-breaking change which fixes an issue)
- [ ] ✨ New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)
## ✅ Checklist
Before requesting a review, confirm the following:
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [ ] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [ ] I have formatted the code (`pdm run format`).
- [ ] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
## ℹ️ Description
- Link to the related issue(s): Issue #667
- Harden breadcrumb category extraction so downloads no longer fail when
the breadcrumb structure changes.
## 📋 Changes Summary
- Parse breadcrumb anchors dynamically and fall back with debug logging
when legacy selectors are needed.
- Added unit coverage for multi-anchor, single-anchor, and fallback
scenarios to keep diff coverage above 80%.
- Documented required lint/format/test steps in PR checklist; no new
dependencies.
### ⚙️ Type of Change
- [x] 🐞 Bug fix (non-breaking change which fixes an issue)
- [ ] ✨ New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)
## ✅ Checklist
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Improved category extraction accuracy with enhanced breadcrumb
parsing.
* Better handling for listings with a single breadcrumb (returns stable
category identifier).
* More resilient fallback when breadcrumb data is missing or malformed.
* Safer normalization of category identifiers to avoid incorrect parsing
across site variations.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->