feat: collect timeout timing sessions for diagnostics (#814)

This commit is contained in:
Jens
2026-02-13 16:45:52 +01:00
committed by GitHub
parent 81c55316db
commit 50fc8781a9
12 changed files with 902 additions and 350 deletions

View File

@@ -78,6 +78,11 @@ The bot will also provide specific instructions on how to fix your configuration
1. Override specific keys under `timeouts` (e.g., `pagination_initial: 20.0`) if only a single selector is problematic.
1. For slow email verification prompts, raise `timeouts.email_verification`.
1. Keep `retry_enabled` on so that DOM lookups are retried with exponential backoff.
1. Attach `timing_data.json` when opening issues so maintainers can tune defaults from real-world timing evidence.
- It is written automatically during runs when `diagnostics.timing_collection` is enabled (default: `true`, see `CONFIGURATION.md`).
- Portable mode path: `./.temp/timing/timing_data.json`
- User directories mode path: `~/.cache/kleinanzeigen-bot/timing/timing_data.json` (Linux), `~/Library/Caches/kleinanzeigen-bot/timing/timing_data.json` (macOS), or `%LOCALAPPDATA%\kleinanzeigen-bot\timing\timing_data.json` (Windows)
- Which one applies depends on your installation mode: portable mode writes next to your config/current directory, user directories mode writes in OS-standard user paths. Check which path exists on your system, or see `CONFIGURATION.md#installation-modes` for mode selection details.
### Issue: Bot fails to detect existing login session

View File

@@ -262,6 +262,7 @@ diagnostics:
publish: false # Capture screenshot + HTML + JSON on each failed publish attempt (timeouts/protocol errors)
capture_log_copy: false # Copy entire bot log file when diagnostics are captured (may duplicate log content)
pause_on_login_detection_failure: false # Pause for manual inspection (interactive only)
timing_collection: true # Collect timeout timing data locally for troubleshooting and tuning
output_dir: "" # Custom output directory (see "Output locations (default)" below)
```
@@ -309,7 +310,44 @@ The bot uses a layered approach to detect login state, prioritizing stealth over
- **User directories mode**: `~/.cache/kleinanzeigen-bot/diagnostics/` (Linux), `~/Library/Caches/kleinanzeigen-bot/diagnostics/` (macOS), or `%LOCALAPPDATA%\kleinanzeigen-bot\Cache\diagnostics\` (Windows)
- **Custom**: Path resolved relative to your `config.yaml` if `output_dir` is specified
> **⚠️ PII Warning:** HTML dumps, JSON payloads, and log copies may contain PII. Typical examples include account email, ad titles/descriptions, contact info, and prices. Log copies are produced by `capture_log_copy` when diagnostics capture runs, such as `capture_on.publish` or `capture_on.login_detection`. Review or redact these artifacts before sharing them publicly.
**Timing collection output (default):**
- **Portable mode**: `./.temp/timing/timing_data.json`
- **User directories mode**: `~/.cache/kleinanzeigen-bot/timing/timing_data.json` (Linux) or `~/Library/Caches/kleinanzeigen-bot/timing/timing_data.json` (macOS)
- Data is grouped by run/session and retained for 30 days via automatic cleanup during each data write
Example structure:
```json
[
{
"session_id": "abc12345",
"command": "publish",
"started_at": "2026-02-07T10:00:00+01:00",
"ended_at": "2026-02-07T10:04:30+01:00",
"records": [
{
"operation_key": "default",
"operation_type": "web_find",
"effective_timeout_sec": 5.0,
"actual_duration_sec": 1.2,
"attempt_index": 0,
"success": true
}
]
}
]
```
How to read it quickly:
- Group by `command` and `session_id` first to compare slow vs fast runs
- Look for high `actual_duration_sec` values near `effective_timeout_sec` and repeated `success: false` entries
- `attempt_index` is zero-based (`0` first attempt, `1` first retry)
- Use `operation_key` + `operation_type` to identify which timeout bucket (`default`, `page_load`, etc.) needs tuning
- For deeper timeout tuning workflow, see [Browser Troubleshooting](./BROWSER_TROUBLESHOOTING.md)
> **⚠️ PII Warning:** HTML dumps, JSON payloads, timing data JSON files (for example `timing_data.json`), and log copies may contain PII. Typical examples include account email, ad titles/descriptions, contact info, and prices. Log copies are produced by `capture_log_copy` when diagnostics capture runs, such as `capture_on.publish` or `capture_on.login_detection`. Review or redact these artifacts before sharing them publicly.
## Installation Modes