Commit Graph

10 Commits

Author SHA1 Message Date
Jens
25079c32c0 fix: increase login detection timeout to fix intermittent failures (#701) (#726)
## ℹ️ Description

This PR fixes intermittent login detection failures where the bot fails
to detect existing login sessions and unnecessarily re-logins,
potentially causing IP blocks.

- Link to the related issue(s): Issue #701
- Describe the motivation and context for this change:

Users reported that the bot sometimes fails to detect existing login
sessions (50/50 behavior), especially for browser profiles that haven't
been used for 20+ days. This appears to be a race condition where:
1. `web_open()` completes when `document.readyState == 'complete'`
2. But kleinanzeigen.de's client-side JavaScript hasn't yet rendered
user profile elements
3. The login detection timeout (5s default) is too short for slow
networks or sessions requiring server-side validation

## 📋 Changes Summary

- **Add dedicated `login_detection` timeout** to `TimeoutConfig`
(default: 10s, previously used generic 5s timeout)
- **Apply timeout to both DOM checks** in `is_logged_in()`: `.mr-medium`
and `#user-email` elements
- **Add debug logging** to track which element detected login or if no
login was found
- **Regenerate JSON schema** to include new timeout configuration
- **Effective total timeout**: ~22.5s (10s base × 1.0 multiplier × 1.5
backoff × 2 retries) vs previous ~11.25s

### Benefits:
- Addresses race condition between page load completion and client-side
rendering
- Provides sufficient time for sessions requiring server-side validation
(20+ days old)
- User-configurable via `timeouts.login_detection` in `config.yaml`
- Follows established pattern of dedicated timeouts (`sms_verification`,
`gdpr_prompt`, etc.)

### ⚙️ Type of Change
- [x] 🐞 Bug fix (non-breaking change which fixes an issue)
- [ ]  New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)


##  Checklist
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added a configurable login-detection timeout (default 10s, min 1s) to
tune session detection.

* **Bug Fixes**
* More reliable login checks using a timeout-aware, two-step detection
sequence.
* Improved diagnostic logging for login attempts, retry behavior,
detection outcomes, and timeout events.

* **Documentation**
* Added troubleshooting guidance explaining the login-detection timeout
and when to adjust it.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-16 21:30:40 +01:00
Jens
9ed87ff17f fix: wait for image upload completion before submitting ad (#716)
## ℹ️ Description
Fixes a race condition where ads were submitted before all images
finished uploading to the server, causing some images to be missing from
published ads.

- Link to the related issue(s): Issue #715
- The bot was submitting ads immediately after the last image
`send_file()` call completed, only waiting 1-2.5 seconds via
`web_sleep()`. This wasn't enough time for server-side image processing,
thumbnail generation, and DOM updates to complete, resulting in missing
images in published ads.

## 📋 Changes Summary

### Image Upload Verification (Initial Fix)
- Added thumbnail verification in `__upload_images()` method to wait for
all image thumbnails to appear in the DOM after upload
- Added configurable timeout `image_upload` to `TimeoutConfig` (default:
30s, minimum: 5s)
- Improved error messages to show expected vs actual image count when
upload times out
- Added German translations for new log messages and error messages
- Regenerated JSON schemas to include new timeout configuration

### Polling Performance & Crash Fix (Follow-up Fix)
- Fixed critical bug where `web_find_all()` would raise `TimeoutError`
when no thumbnails exist yet, causing immediate crash
- Wrapped DOM queries in `try/except TimeoutError` blocks to handle
empty results gracefully
- Changed polling to use `self._timeout("quick_dom")` (~1s with PR #718)
instead of default timeout
- Improved polling performance: reduced cycle time from ~2s to ~1.5s
- DOM queries are client-side only (no server load from frequent
polling)

**New configuration option:**
```yaml
timeouts:
  image_upload: 30.0  # Total timeout for image upload and server-side processing
  quick_dom: 1.0      # Per-poll timeout for thumbnail checks (adjustable via multiplier)
```

The bot now polls the DOM for `ul#j-pictureupload-thumbnails >
li.ui-sortable-handle` elements after uploading images, ensuring
server-side processing is complete before submitting the ad form.

### ⚙️ Type of Change
- [x] 🐞 Bug fix (non-breaking change which fixes an issue)
- [ ]  New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)


##  Checklist
Before requesting a review, confirm the following:
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Image uploads now verify completion by waiting for all uploaded
thumbnails to appear before proceeding.

* **Improvements**
  * Added a configurable image upload timeout (default 30s, minimum 5s).
* Improved timeout reporting: when thumbnails don’t appear in time, the
app returns clearer feedback showing expected vs. observed counts.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-10 14:04:42 +01:00
Jens
00fa0d359f feat: Add Chrome 136+ safe defaults for browser configuration (#717)
## ℹ️ Description

This PR updates the default browser configuration to be safe for
Chrome/Chromium 136+ out of the box.

Chrome 136+ (released March 2025) requires `--user-data-dir` to be
specified when using `--remote-debugging-port` for security reasons.
Since nodriver relies on remote debugging, the bot needs proper defaults
to avoid validation errors.

**Motivation:** Eliminate Chrome 136+ configuration validation errors
for fresh installations and ensure session persistence by default.

## 📋 Changes Summary

- Set `browser.arguments` default to include
`--user-data-dir=.temp/browser-profile`
- Set `browser.user_data_dir` default to `.temp/browser-profile`
(previously `None`)
- Regenerated JSON schema (`config.schema.json`) with new defaults

**Benefits:**
-  Chrome 136+ compatible out of the box (no validation errors)
-  Browser session/cookies persist across runs (better UX)
-  Consistent with existing `.temp` directory pattern (update state,
caches)
-  Already gitignored - no accidental commits of browser profiles

**No breaking changes:** Existing configs with explicit
`browser.arguments: []` continue to work (users can override defaults).

### ⚙️ Type of Change
- [x]  New feature (adds new functionality without breaking existing
usage)

##  Checklist
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Bug Fixes**
* Standardized browser profile configuration with improved default user
data directory settings.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-12-07 01:07:35 +01:00
Jens
a3ac27c441 feat: add configurable timeouts (#673)
## ℹ️ Description
- Related issues: #671, #658
- Introduces configurable timeout controls plus retry/backoff handling
for flaky DOM operations.

We often see timeouts which are note reproducible in certain
configurations. I suspect timeout issues based on a combination of
internet speed, browser, os, age of the computer and the weather.

This PR introduces a comprehensive config model to tweak timeouts.

## 📋 Changes Summary
- add TimeoutConfig to the main config/schema and expose timeouts in
README/docs
- wire WebScrapingMixin, extractor, update checker, and browser
diagnostics to honor the configurable timeouts and retries
- update translations/tests to cover the new behaviour and ensure
lint/mypy/pyright pipelines remain green

### ⚙️ Type of Change
- [ ] 🐞 Bug fix (non-breaking change which fixes an issue)
- [x]  New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)

##  Checklist
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Centralized, configurable timeout system for web interactions,
detection flows, publishing, and pagination.
* Optional retry with exponential backoff for operations that time out.

* **Improvements**
* Replaced fixed wait times with dynamic timeouts throughout workflows.
  * More informative timeout-related messages and diagnostics.

* **Tests**
* New and expanded test coverage for timeout behavior, pagination,
diagnostics, and retry logic.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-11-13 15:08:52 +01:00
Jens Bergmann
91a40b0116 feat: enhanced folder naming (#599) 2025-08-12 10:43:26 +02:00
Jens Bergmann
5430f5cdc6 feat: update check (#561)
feat(update-check): add robust update check with interval support, state management, and CLI integration

- Implement version and interval-based update checks with configurable settings
- Add CLI command `kleinanzeigen-bot update-check` for manual checks
- Introduce state file with versioning, UTC timestamps, and migration logic
- Validate and normalize intervals (1d–4w) with fallback for invalid values
- Ensure correct handling of timezones and elapsed checks
- Improve error handling, logging, and internationalization (i18n)
- Add comprehensive test coverage for config, interval logic, migration, and CLI
- Align default config, translations, and schema with new functionality
- Improve help command UX by avoiding config/log loading for `--help`
- Update documentation and README with full feature overview
2025-06-27 07:52:40 +02:00
sebthom
85bd5c2f2a fix: update config schema 2025-06-05 13:07:07 +02:00
sebthom
85a5cf5224 feat: improve content_hash calculation 2025-05-15 12:07:49 +02:00
sebthom
6ede14596d feat: add type safe Ad model 2025-05-15 12:07:49 +02:00
sebthom
1369da1c34 feat: add type safe Config model 2025-05-15 12:07:49 +02:00