feat: add configurable timeouts (#673)

## ℹ️ Description
- Related issues: #671, #658
- Introduces configurable timeout controls plus retry/backoff handling
for flaky DOM operations.

We often see timeouts that are not reproducible in certain
configurations. I suspect they depend on a combination of
internet speed, browser, OS, age of the computer, and the weather.

This PR introduces a comprehensive config model to tweak timeouts.
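
For illustration only, here is a minimal sketch of how a per-operation timeout could be scaled and retried. It is not the code in this PR: the names `effective_timeout`, `run_with_retries` and `max_retries` are hypothetical, and counting retries as extra attempts after the first try is an assumption. The scaling formula (base × multiplier × backoff factor to the power of the attempt number) follows what the tests in this change assert.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import TypeVar

T = TypeVar("T")


def effective_timeout(base: float, multiplier: float, backoff_factor: float, attempt: int = 0) -> float:
    """Scale a base timeout by the global multiplier and the per-attempt backoff."""
    return base * multiplier * (backoff_factor ** attempt)


async def run_with_retries(
    operation: Callable[[], Awaitable[T]],
    *,
    base: float,
    multiplier: float = 1.0,
    backoff_factor: float = 1.5,
    max_retries: int = 2,
) -> T:
    """Hypothetical helper: retry `operation`, granting a longer timeout on each attempt."""
    for attempt in range(max_retries + 1):
        timeout = effective_timeout(base, multiplier, backoff_factor, attempt)
        try:
            # A fresh awaitable is created per attempt; a timed-out one is cancelled.
            return await asyncio.wait_for(operation(), timeout)
        except asyncio.TimeoutError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the timeout to the caller
    raise AssertionError("unreachable for max_retries >= 0")
```

Passing a factory (`operation`) instead of an already-created coroutine matters here, because a coroutine cancelled by a timeout cannot be awaited a second time.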

## 📋 Changes Summary
- add TimeoutConfig to the main config/schema and expose timeouts in
README/docs
- wire WebScrapingMixin, extractor, update checker, and browser
diagnostics to honor the configurable timeouts and retries
- update translations/tests to cover the new behaviour and ensure
lint/mypy/pyright pipelines remain green

### ⚙️ Type of Change
- [ ] 🐞 Bug fix (non-breaking change which fixes an issue)
- [x] New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)

## Checklist
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
  * Centralized, configurable timeout system for web interactions,
    detection flows, publishing, and pagination.
  * Optional retry with exponential backoff for operations that time out.

* **Improvements**
  * Replaced fixed wait times with dynamic timeouts throughout workflows.
  * More informative timeout-related messages and diagnostics.

* **Tests**
  * New and expanded test coverage for timeout behavior, pagination,
    diagnostics, and retry logic.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Jens
2025-11-13 15:08:52 +01:00
committed by GitHub
parent ac678ed888
commit a3ac27c441
16 changed files with 972 additions and 121 deletions

@@ -1,7 +1,9 @@
 # SPDX-FileCopyrightText: © Sebastian Thomschke and contributors
 # SPDX-License-Identifier: AGPL-3.0-or-later
 # SPDX-ArtifactOfProjectHomePage: https://github.com/Second-Hand-Friends/kleinanzeigen-bot/
-from kleinanzeigen_bot.model.config_model import AdDefaults, Config
+import pytest
+
+from kleinanzeigen_bot.model.config_model import AdDefaults, Config, TimeoutConfig
 
 
 def test_migrate_legacy_description_prefix() -> None:
@@ -74,3 +76,50 @@ def test_minimal_config_validation() -> None:
     config = Config.model_validate(minimal_cfg)
     assert config.login.username == "dummy"
     assert config.login.password == "dummy" # noqa: S105
+
+
+def test_timeout_config_defaults_and_effective_values() -> None:
+    cfg = Config.model_validate({
+        "login": {"username": "dummy", "password": "dummy"}, # noqa: S105
+        "timeouts": {
+            "multiplier": 2.0,
+            "pagination_initial": 12.0,
+            "retry_max_attempts": 3,
+            "retry_backoff_factor": 2.0
+        }
+    })
+
+    timeouts = cfg.timeouts
+    base = timeouts.resolve("pagination_initial")
+    multiplier = timeouts.multiplier
+    backoff = timeouts.retry_backoff_factor
+    assert base == 12.0
+    assert timeouts.effective("pagination_initial") == base * multiplier * (backoff ** 0)
+    # attempt 1 should apply backoff factor once in addition to multiplier
+    assert timeouts.effective("pagination_initial", attempt = 1) == base * multiplier * (backoff ** 1)
+
+
+def test_validate_glob_pattern_rejects_blank_strings() -> None:
+    with pytest.raises(ValueError, match = "must be a non-empty, non-blank glob pattern"):
+        Config.model_validate({
+            "ad_files": [" "],
+            "ad_defaults": {"contact": {"name": "dummy", "zipcode": "12345"}},
+            "login": {"username": "dummy", "password": "dummy"}
+        })
+
+    cfg = Config.model_validate({
+        "ad_files": ["*.yaml"],
+        "ad_defaults": {"contact": {"name": "dummy", "zipcode": "12345"}},
+        "login": {"username": "dummy", "password": "dummy"}
+    })
+    assert cfg.ad_files == ["*.yaml"]
+
+
+def test_timeout_config_resolve_returns_specific_value() -> None:
+    timeouts = TimeoutConfig(default = 4.0, page_load = 12.5)
+    assert timeouts.resolve("page_load") == 12.5
+
+
+def test_timeout_config_resolve_falls_back_to_default() -> None:
+    timeouts = TimeoutConfig(default = 3.0)
+    assert timeouts.resolve("nonexistent_key") == 3.0
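
The tests above pin down two behaviours: `resolve` returns a key's specific value or falls back to `default`, and `effective` multiplies that value by `multiplier` and by `retry_backoff_factor` raised to the attempt number. The following is a minimal stand-in that satisfies the same assertions, assuming a plain dataclass with an `overrides` dict. The class name `TimeoutConfigSketch`, the `overrides` field and the default numbers are hypothetical; the real `TimeoutConfig` lives in `kleinanzeigen_bot.model.config_model` and its implementation is not shown in this diff.

```python
from dataclasses import dataclass, field


@dataclass
class TimeoutConfigSketch:
    """Hypothetical stand-in for TimeoutConfig; default values are placeholders."""

    default: float = 5.0
    multiplier: float = 1.0
    retry_backoff_factor: float = 1.5
    # Per-operation values such as "page_load" or "pagination_initial" (assumed layout).
    overrides: dict[str, float] = field(default_factory=dict)

    def resolve(self, key: str) -> float:
        """Return the specific timeout for `key`, or `default` if none is set."""
        return self.overrides.get(key, self.default)

    def effective(self, key: str, attempt: int = 0) -> float:
        """Apply the global multiplier and the backoff for the given attempt."""
        return self.resolve(key) * self.multiplier * (self.retry_backoff_factor ** attempt)


timeouts = TimeoutConfigSketch(
    multiplier=2.0,
    retry_backoff_factor=2.0,
    overrides={"pagination_initial": 12.0},
)
first_try = timeouts.effective("pagination_initial")
first_retry = timeouts.effective("pagination_initial", attempt=1)
```

With these inputs the first try resolves to 12.0 × 2.0 and the first retry doubles that again, mirroring the arithmetic in `test_timeout_config_defaults_and_effective_values`.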