feat: add configurable timeouts (#673)

## ℹ️ Description
- Related issues: #671, #658
- Introduces configurable timeout controls plus retry/backoff handling
for flaky DOM operations.

We often see timeouts that are not reproducible in certain
configurations. I suspect they depend on a combination of
internet speed, browser, OS, age of the computer, and the weather.

This PR introduces a comprehensive config model to tweak timeouts.
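
As a rough illustration of what such a config model could look like, here is a minimal sketch. Only the name `TimeoutConfig` comes from this PR; the class name `TimeoutConfigSketch`, the fields `multiplier` and `defaults`, the keys, and the `resolve` helper are all assumptions made for this example and may not match the real schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TimeoutConfigSketch:
    """Hypothetical stand-in for TimeoutConfig; every field name here is illustrative."""

    # Global scaling factor so slow machines can stretch all timeouts at once (assumed).
    multiplier: float = 1.0
    # Per-operation base timeouts in seconds (assumed keys and values).
    defaults: dict[str, float] = field(
        default_factory=lambda: {"default": 5.0, "page_load": 15.0}
    )

    def resolve(self, key: str = "default") -> float:
        """Return the effective timeout for `key`, falling back to the default entry."""
        base = self.defaults.get(key, self.defaults["default"])
        return base * self.multiplier
```

With this shape, a user on a slow connection would only need to raise one value, for example `TimeoutConfigSketch(multiplier=2.0).resolve("page_load")`, instead of editing each wait individually.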

## 📋 Changes Summary
- add TimeoutConfig to the main config/schema and expose timeouts in
README/docs
- wire WebScrapingMixin, extractor, update checker, and browser
diagnostics to honor the configurable timeouts and retries
- update translations/tests to cover the new behaviour and ensure
lint/mypy/pyright pipelines remain green
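
The retry/backoff handling mentioned above can be sketched roughly as follows. This is not the code from the PR: `run_with_retry`, `max_attempts`, and `backoff_factor` are hypothetical names, and the sketch assumes the wrapped operation is an async callable that accepts its timeout as an argument.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import TypeVar

T = TypeVar("T")


async def run_with_retry(
    operation: Callable[[float], Awaitable[T]],
    *,
    timeout: float,
    max_attempts: int = 3,
    backoff_factor: float = 2.0,
) -> T:
    """Run `operation`, retrying on timeout with an exponentially growing timeout."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    current = timeout
    for attempt in range(1, max_attempts + 1):
        try:
            # Enforce the timeout from the outside as well as passing it in.
            return await asyncio.wait_for(operation(current), timeout=current)
        except asyncio.TimeoutError:
            if attempt == max_attempts:
                raise  # Retries exhausted: let the caller see the timeout.
            current *= backoff_factor  # Give the next attempt more time.
    raise AssertionError("unreachable")
```

Growing the timeout between attempts, rather than repeating the same one, is one way to accommodate slow machines without penalizing fast ones on the first try.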

### ⚙️ Type of Change
- [ ] 🐞 Bug fix (non-breaking change which fixes an issue)
- [x] New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)

## Checklist
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
  * Centralized, configurable timeout system for web interactions,
    detection flows, publishing, and pagination.
  * Optional retry with exponential backoff for operations that time out.

* **Improvements**
  * Replaced fixed wait times with dynamic timeouts throughout workflows.
  * More informative timeout-related messages and diagnostics.

* **Tests**
  * New and expanded test coverage for timeout behavior, pagination,
    diagnostics, and retry logic.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
This commit is contained in:
Jens
2025-11-13 15:08:52 +01:00
committed by GitHub
parent ac678ed888
commit a3ac27c441
16 changed files with 972 additions and 121 deletions

@@ -219,6 +219,8 @@ kleinanzeigen_bot/extract.py:
_extract_category_from_ad_page:
"Breadcrumb container 'vap-brdcrmb' not found; cannot extract ad category: %s": "Breadcrumb-Container 'vap-brdcrmb' nicht gefunden; kann Anzeigenkategorie nicht extrahieren: %s"
"Falling back to legacy breadcrumb selectors; collected ids: %s": "Weiche auf ältere Breadcrumb-Selektoren aus; gesammelte IDs: %s"
"Legacy breadcrumb selectors not found within %.1f seconds (collected ids: %s)": "Ältere Breadcrumb-Selektoren nicht innerhalb von %.1f Sekunden gefunden (gesammelte IDs: %s)"
"Unable to locate breadcrumb fallback selectors within %(seconds).1f seconds.": "Ältere Breadcrumb-Selektoren konnten nicht innerhalb von %(seconds).1f Sekunden gefunden werden."
#################################################
kleinanzeigen_bot/utils/i18n.py:
@@ -398,11 +400,6 @@ kleinanzeigen_bot/utils/web_scraping_mixin.py:
web_check:
"Unsupported attribute: %s": "Nicht unterstütztes Attribut: %s"
web_find:
"Unsupported selector type: %s": "Nicht unterstützter Selektor-Typ: %s"
web_find_all:
"Unsupported selector type: %s": "Nicht unterstützter Selektor-Typ: %s"
close_browser_session:
"Closing Browser session...": "Schließe Browser-Sitzung..."
@@ -417,6 +414,12 @@ kleinanzeigen_bot/utils/web_scraping_mixin.py:
web_request:
" -> HTTP %s [%s]...": " -> HTTP %s [%s]..."
_web_find_once:
"Unsupported selector type: %s": "Nicht unterstützter Selektor-Typ: %s"
_web_find_all_once:
"Unsupported selector type: %s": "Nicht unterstützter Selektor-Typ: %s"
diagnose_browser_issues:
"=== Browser Connection Diagnostics ===": "=== Browser-Verbindungsdiagnose ==="
"=== End Diagnostics ===": "=== Ende der Diagnose ==="
@@ -434,6 +437,8 @@ kleinanzeigen_bot/utils/web_scraping_mixin.py:
"(info) Remote debugging port configured: %d": "(Info) Remote-Debugging-Port konfiguriert: %d"
"(info) Remote debugging port is not open": "(Info) Remote-Debugging-Port ist nicht offen"
"(warn) Unable to inspect browser processes: %s": "(Warnung) Browser-Prozesse konnten nicht überprüft werden: %s"
"(info) No browser processes currently running": "(Info) Derzeit keine Browser-Prozesse aktiv"
"(fail) Running as root - this can cause browser issues": "(Fehler) Läuft als Root - dies kann Browser-Probleme verursachen"