fix: increase login detection timeout to fix intermittent failures (#701) (#726)

## ℹ️ Description

This PR fixes intermittent login detection failures where the bot fails
to detect an existing login session and logs in again unnecessarily,
which can trigger IP blocks.

- Link to the related issue(s): Issue #701
- Describe the motivation and context for this change:

Users reported that the bot sometimes fails to detect existing login
sessions (50/50 behavior), especially for browser profiles that haven't
been used for 20+ days. This appears to be a race condition where:
1. `web_open()` completes when `document.readyState == 'complete'`
2. But kleinanzeigen.de's client-side JavaScript hasn't yet rendered
user profile elements
3. The login detection timeout (5s default) is too short for slow
networks or sessions requiring server-side validation
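
The race described above can be reproduced in isolation. The following is a minimal sketch, not the bot's code: `simulate` and `wait_for_element` are hypothetical helpers, and the delays are scaled down and chosen arbitrarily so that the render time falls between a short and a longer timeout.

```python
import asyncio


async def wait_for_element(rendered: asyncio.Event, timeout: float) -> bool:
    """Return True if the element appears within `timeout` seconds."""
    try:
        await asyncio.wait_for(rendered.wait(), timeout)
        return True
    except asyncio.TimeoutError:
        return False


async def simulate(render_delay: float, timeout: float) -> bool:
    """The page counts as loaded at t=0; the profile element renders later."""
    rendered = asyncio.Event()

    async def client_side_render() -> None:
        # Stands in for client-side JavaScript that runs after page load.
        await asyncio.sleep(render_delay)
        rendered.set()

    task = asyncio.create_task(client_side_render())
    try:
        return await wait_for_element(rendered, timeout)
    finally:
        task.cancel()


# Scaled-down, hypothetical delays: the element renders after 0.2s.
short = asyncio.run(simulate(render_delay=0.2, timeout=0.1))
longer = asyncio.run(simulate(render_delay=0.2, timeout=0.4))
print(short, longer)
```

With the shorter timeout the check gives up before the element exists, which is the false "not logged in" result this PR targets.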

## 📋 Changes Summary

- **Add dedicated `login_detection` timeout** to `TimeoutConfig`
(default: 10s, previously used generic 5s timeout)
- **Apply timeout to both DOM checks** in `is_logged_in()`: `.mr-medium`
and `#user-email` elements
- **Add debug logging** to track which element detected login or if no
login was found
- **Regenerate JSON schema** to include new timeout configuration
- **Effective total timeout**: ~22.5s (10s base × 1.0 multiplier × 1.5²
backoff across 2 retries) vs previous ~11.25s (5s × 1.5²)
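
The ~22.5s and ~11.25s figures are consistent with a backoff factor of 1.5 applied exponentially per retry. The sketch below works through that arithmetic under this assumption; `effective_timeout` is a hypothetical helper written for illustration, not the bot's own method.

```python
def effective_timeout(base: float, multiplier: float = 1.0,
                      backoff_factor: float = 1.5, attempt: int = 0) -> float:
    """Timeout for one attempt (0 = first try), assuming exponential backoff."""
    return base * multiplier * backoff_factor ** attempt


# Initial attempt plus two retries, for the new and the previous base timeout.
new_schedule = [effective_timeout(10.0, attempt=a) for a in range(3)]
old_schedule = [effective_timeout(5.0, attempt=a) for a in range(3)]
print(new_schedule)  # [10.0, 15.0, 22.5]
print(old_schedule)  # [5.0, 7.5, 11.25]
```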

### Benefits:
- Addresses race condition between page load completion and client-side
rendering
- Provides sufficient time for sessions requiring server-side validation
(20+ days old)
- User-configurable via `timeouts.login_detection` in `config.yaml`
- Follows established pattern of dedicated timeouts (`sms_verification`,
`gdpr_prompt`, etc.)

### ⚙️ Type of Change
- [x] 🐞 Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)


## Checklist
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

## Summary by CodeRabbit

* **New Features**
  * Added a configurable login-detection timeout (default 10s, min 1s) to
    tune session detection.

* **Bug Fixes**
  * More reliable login checks using a timeout-aware, two-step detection
    sequence.
  * Improved diagnostic logging for login attempts, retry behavior,
    detection outcomes, and timeout events.

* **Documentation**
  * Added troubleshooting guidance explaining the login-detection timeout
    and when to adjust it.

Author: Jens, 2025-12-16 21:30:40 +01:00 (committed by GitHub)
Parent: ce833b9350
Commit: 25079c32c0
6 changed files with 77 additions and 10 deletions

@@ -292,6 +292,7 @@ timeouts:
   captcha_detection: 2.0  # Timeout for captcha iframe detection
   sms_verification: 4.0  # Timeout for SMS verification banners
   gdpr_prompt: 10.0  # Timeout when handling GDPR dialogs
+  login_detection: 10.0  # Timeout for detecting existing login session via DOM elements
   publishing_result: 300.0  # Timeout for publishing status checks
   publishing_confirmation: 20.0  # Timeout for publish confirmation redirect
   image_upload: 30.0  # Timeout for image upload and server-side processing

@@ -71,6 +71,29 @@ The bot will also provide specific instructions on how to fix your configuration
 2. Override specific keys under `timeouts` (e.g. `pagination_initial: 20.0`) if only a single selector is problematic.
 3. Keep `retry_enabled` on so that DOM lookups are retried with exponential backoff.
+
+### Issue: Bot fails to detect existing login session
+
+**Symptoms:**
+- Bot re-logins despite being already authenticated
+- Intermittent (50/50) login detection behavior
+- More common with profiles unused for 20+ days
+
+**What `login_detection` controls:**
+- Maximum time (seconds) to wait for user profile DOM elements when checking if already logged in
+- Default: `10.0` seconds (provides ~22.5s total with retry/backoff)
+- Used at startup before attempting login
+
+**When to increase `login_detection`:**
+- Frequent unnecessary re-logins despite being authenticated
+- Slow or unstable network connection
+- Using browser profiles that haven't been active for weeks
+
+**Example:**
+```yaml
+timeouts:
+  login_detection: 15.0 # For slower networks or old sessions
+```

 ## Common Issues and Solutions

 ### Issue 1: "Failed to connect to browser" with "root" error

@@ -403,6 +403,13 @@
           "title": "Gdpr Prompt",
           "type": "number"
         },
+        "login_detection": {
+          "default": 10.0,
+          "description": "Timeout for detecting existing login session via DOM elements.",
+          "minimum": 1.0,
+          "title": "Login Detection",
+          "type": "number"
+        },
         "publishing_result": {
           "default": 300.0,
           "description": "Timeout for publishing result checks.",

@@ -595,6 +595,8 @@ class KleinanzeigenBot(WebScrapingMixin):
     async def login(self) -> None:
         LOG.info("Checking if already logged in...")
         await self.web_open(f"{self.root_url}")
+        if getattr(self, "page", None) is not None:
+            LOG.debug(_("Current page URL after opening homepage: %s"), self.page.url)

         if await self.is_logged_in():
             LOG.info("Already logged in as [%s]. Skipping login.", self.config.login.username)
@@ -608,9 +610,15 @@ class KleinanzeigenBot(WebScrapingMixin):
         # Sometimes a second login is required
         if not await self.is_logged_in():
+            LOG.debug(_("First login attempt did not succeed, trying second login attempt"))
             await self.fill_login_data_and_send()
             await self.handle_after_login_logic()

+            if await self.is_logged_in():
+                LOG.debug(_("Second login attempt succeeded"))
+            else:
+                LOG.warning(_("Second login attempt also failed - login may not have succeeded"))
+
     async def fill_login_data_and_send(self) -> None:
         LOG.info("Logging in as [%s]...", self.config.login.username)
         await self.web_input(By.ID, "login-email", self.config.login.username)
@@ -646,19 +654,34 @@ class KleinanzeigenBot(WebScrapingMixin):
             pass

     async def is_logged_in(self) -> bool:
-        try:
+        # Use login_detection timeout (10s default) instead of default (5s)
+        # to allow sufficient time for client-side JavaScript rendering after page load.
+        # This is especially important for older sessions (20+ days) that require
+        # additional server-side validation time.
+        login_check_timeout = self._timeout("login_detection")
+        effective_timeout = self._effective_timeout("login_detection")
+        username = self.config.login.username.lower()
+        LOG.debug(_("Starting login detection (timeout: %.1fs base, %.1fs effective with multiplier/backoff)"), login_check_timeout, effective_timeout)
+
         # Try to find the standard element first
-            user_info = await self.web_text(By.CLASS_NAME, "mr-medium")
-            if self.config.login.username.lower() in user_info.lower():
-                return True
-        except TimeoutError:
         try:
-                # If standard element not found, try the alternative
-                user_info = await self.web_text(By.ID, "user-email")
-                if self.config.login.username.lower() in user_info.lower():
+            user_info = await self.web_text(By.CLASS_NAME, "mr-medium", timeout = login_check_timeout)
+            if username in user_info.lower():
+                LOG.debug(_("Login detected via .mr-medium element"))
                 return True
         except TimeoutError:
-                return False
+            LOG.debug(_("Timeout waiting for .mr-medium element after %.1fs"), effective_timeout)
+
+        # If standard element not found or didn't contain username, try the alternative
+        try:
+            user_info = await self.web_text(By.ID, "user-email", timeout = login_check_timeout)
+            if username in user_info.lower():
+                LOG.debug(_("Login detected via #user-email element"))
+                return True
+        except TimeoutError:
+            LOG.debug(_("Timeout waiting for #user-email element after %.1fs"), effective_timeout)
+
+        LOG.debug(_("No login detected - neither .mr-medium nor #user-email found with username"))
         return False

     async def delete_ads(self, ad_cfgs:list[tuple[str, Ad, dict[str, Any]]]) -> None:
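
The two-step detection in `is_logged_in()` can be exercised without a browser. The following is a rough sketch under stated assumptions: `Lookup` and `make_fake_page` are hypothetical stand-ins for the real DOM lookup, and the selectors are passed as plain strings. It shows that the second selector is tried both when the first times out and when the first is found but does not contain the username.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical lookup: (selector, timeout) -> element text, or TimeoutError.
Lookup = Callable[[str, float], Awaitable[str]]


async def is_logged_in(lookup: Lookup, username: str, timeout: float) -> bool:
    """Check two selectors in order; timeouts and mismatches fall through."""
    needle = username.lower()
    for selector in (".mr-medium", "#user-email"):
        try:
            text = await lookup(selector, timeout)
        except TimeoutError:
            continue  # element never appeared, try the next selector
        if needle in text.lower():
            return True
    return False


def make_fake_page(elements: dict[str, str]) -> Lookup:
    """Build an in-memory replacement for the DOM lookup (illustration only)."""
    async def lookup(selector: str, timeout: float) -> str:
        if selector not in elements:
            raise TimeoutError(selector)
        return elements[selector]
    return lookup


only_fallback = make_fake_page({"#user-email": "Jane@Example.com"})
logged_out = make_fake_page({".mr-medium": "Einloggen"})

found = asyncio.run(is_logged_in(only_fallback, "jane@example.com", 10.0))
missing = asyncio.run(is_logged_in(logged_out, "jane@example.com", 10.0))
print(found, missing)  # True False
```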

@@ -125,6 +125,7 @@ class TimeoutConfig(ContextualModel):
     captcha_detection:float = Field(default = 2.0, ge = 0.1, description = "Timeout for captcha iframe detection.")
     sms_verification:float = Field(default = 4.0, ge = 0.1, description = "Timeout for SMS verification prompts.")
     gdpr_prompt:float = Field(default = 10.0, ge = 1.0, description = "Timeout for GDPR/consent dialogs.")
+    login_detection:float = Field(default = 10.0, ge = 1.0, description = "Timeout for detecting existing login session via DOM elements.")
     publishing_result:float = Field(default = 300.0, ge = 10.0, description = "Timeout for publishing result checks.")
     publishing_confirmation:float = Field(default = 20.0, ge = 1.0, description = "Timeout for publish confirmation redirect.")
     image_upload:float = Field(default = 30.0, ge = 5.0, description = "Timeout for image upload and server-side processing.")
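
The `default = 10.0, ge = 1.0` pair means the field falls back to 10 seconds and rejects values below 1 second. As a dependency-free illustration of that constraint, the sketch below uses a standard-library dataclass instead of the project's model; `TimeoutSketch` is a hypothetical name and covers only this one field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeoutSketch:
    """Stdlib stand-in for the config model; only one field is shown."""
    login_detection: float = 10.0

    def __post_init__(self) -> None:
        # Mirrors the `ge = 1.0` constraint: values below 1.0 are rejected.
        if self.login_detection < 1.0:
            raise ValueError("login_detection must be >= 1.0")


default_cfg = TimeoutSketch()
slow_network_cfg = TimeoutSketch(login_detection=15.0)

try:
    TimeoutSketch(login_detection=0.5)
    rejected = False
except ValueError:
    rejected = True

print(default_cfg.login_detection, slow_network_cfg.login_detection, rejected)
```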

@@ -61,8 +61,20 @@ kleinanzeigen_bot/__init__.py:
   login:
     "Checking if already logged in...": "Überprüfe, ob bereits eingeloggt..."
+    "Current page URL after opening homepage: %s": "Aktuelle Seiten-URL nach dem Öffnen der Startseite: %s"
     "Already logged in as [%s]. Skipping login.": "Bereits eingeloggt als [%s]. Überspringe Anmeldung."
     "Opening login page...": "Öffne Anmeldeseite..."
+    "First login attempt did not succeed, trying second login attempt": "Erster Anmeldeversuch war nicht erfolgreich, versuche zweiten Anmeldeversuch"
+    "Second login attempt succeeded": "Zweiter Anmeldeversuch erfolgreich"
+    "Second login attempt also failed - login may not have succeeded": "Zweiter Anmeldeversuch ebenfalls fehlgeschlagen - Anmeldung möglicherweise nicht erfolgreich"
+
+  is_logged_in:
+    "Starting login detection (timeout: %.1fs base, %.1fs effective with multiplier/backoff)": "Starte Login-Erkennung (Timeout: %.1fs Basis, %.1fs effektiv mit Multiplikator/Backoff)"
+    "Login detected via .mr-medium element": "Login erkannt über .mr-medium Element"
+    "Timeout waiting for .mr-medium element after %.1fs": "Timeout beim Warten auf .mr-medium Element nach %.1fs"
+    "Login detected via #user-email element": "Login erkannt über #user-email Element"
+    "Timeout waiting for #user-email element after %.1fs": "Timeout beim Warten auf #user-email Element nach %.1fs"
+    "No login detected - neither .mr-medium nor #user-email found with username": "Kein Login erkannt - weder .mr-medium noch #user-email mit Benutzername gefunden"

   handle_after_login_logic:
     "# Device verification message detected. Please follow the instruction displayed in the Browser.": "# Nachricht zur Geräteverifizierung erkannt. Bitte den Anweisungen im Browser folgen."