mirror of https://github.com/Second-Hand-Friends/kleinanzeigen-bot.git synced 2026-03-12 02:31:45 +01:00

Jens a4946ba104 docs: refactor guides for clearer navigation (#795 )

2026-01-30 11:06:36 +01:00

Browser Connection Troubleshooting Guide

This guide helps you resolve common browser connection issues with the kleinanzeigen-bot.

⚠️ Important: Chrome 136+ Security Changes (March 2025)

If you're using Chrome 136 or later and remote debugging stopped working, this is likely the cause.

Google implemented security changes in Chrome 136 that require a non-default --user-data-dir to be specified when using --remote-debugging-port. This prevents attackers from accessing the default Chrome profile and stealing cookies/credentials.

Quick Fix

# Start Chrome with custom user data directory
chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug-profile

In your config.yaml

browser:
  arguments:
    - --remote-debugging-port=9222
    - --user-data-dir=/tmp/chrome-debug-profile  # Required for Chrome 136+
  user_data_dir: "/tmp/chrome-debug-profile"     # Must match the argument above

The bot will automatically detect Chrome 136+ and provide clear error messages if your configuration is missing the required --user-data-dir setting.

For more details, see Chrome 136+ Security Changes below.

Quick Diagnosis

Run the diagnostic command to automatically check your setup:

For binary users:

kleinanzeigen-bot diagnose

For source users:

pdm run app diagnose

This will check:

Browser binary availability and permissions
User data directory permissions
Remote debugging port status
Running browser processes
Platform-specific issues
Chrome/Edge version detection and configuration validation

Automatic Chrome 136+ Validation: The bot automatically detects Chrome/Edge 136+ and validates your configuration. If you're using Chrome 136+ with remote debugging but missing the required --user-data-dir setting, you'll see clear error messages like:

Chrome 136+ configuration validation failed: Chrome 136+ requires --user-data-dir
Please update your configuration to include --user-data-dir for remote debugging

The bot will also provide specific instructions on how to fix your configuration.
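
The validation rule described above can be sketched in a few lines. This is a minimal illustration, not the bot's actual implementation; the function name `missing_user_data_dir` is hypothetical.

```python
def missing_user_data_dir(major_version: int, arguments: list[str]) -> bool:
    """Return True when remote debugging is requested on Chrome/Edge 136+
    without an explicit --user-data-dir argument."""
    uses_remote_debugging = any(
        arg.startswith("--remote-debugging-port") for arg in arguments
    )
    has_user_data_dir = any(arg.startswith("--user-data-dir=") for arg in arguments)
    return major_version >= 136 and uses_remote_debugging and not has_user_data_dir


# A configuration that would trigger the validation error on Chrome 136+
print(missing_user_data_dir(136, ["--remote-debugging-port=9222"]))  # True
```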

Issue: Slow page loads or recurring TimeoutError

Symptoms:

_extract_category_from_ad_page fails intermittently due to breadcrumb lookups timing out
Captcha/SMS/GDPR prompts appear right after a timeout
Requests to GitHub's API fail sporadically with timeout errors

Solutions:

Increase timeouts.multiplier in config.yaml (e.g., 2.0 doubles every timeout consistently).
Override specific keys under timeouts (e.g., pagination_initial: 20.0) if only a single selector is problematic.
For slow email verification prompts, raise timeouts.email_verification.
Keep retry_enabled on so that DOM lookups are retried with exponential backoff.
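
To make the effect of these settings concrete, here is a rough sketch of how a multiplier and exponential backoff typically combine. The function names and the base values used in the test are hypothetical, and the assumption that the multiplier also scales per-key overrides is not confirmed by this guide.

```python
def effective_timeout(key: str, timeouts: dict, defaults: dict) -> float:
    """Pick the per-key override (or the default) and scale it by timeouts.multiplier.

    Assumption: the multiplier is applied to overrides as well as defaults.
    """
    base = timeouts.get(key, defaults[key])
    return base * timeouts.get("multiplier", 1.0)


def backoff_delays(base_delay: float, attempts: int, factor: float = 2.0) -> list[float]:
    """Delays before each retry: base, base*factor, base*factor**2, ..."""
    return [base_delay * factor**i for i in range(attempts)]
```

With a hypothetical default of 10.0 seconds, a multiplier of 2.0 yields 20.0 seconds, and three retries starting at 0.5 seconds wait 0.5, 1.0, and 2.0 seconds.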

Issue: Bot fails to detect existing login session

Symptoms:

Bot re-logins despite being already authenticated
Intermittent (50/50) login detection behavior
More common with profiles unused for 20+ days

How login detection works: The bot checks your login status using page elements first (to minimize bot-like behavior), with a fallback to a server-side request if needed.

The bot uses a DOM-based check as the primary method to detect login state and escalates only when that check is inconclusive:

1. DOM check (preferred, stealthy): Checks for user profile elements in the page
   - Looks for the .mr-medium element containing the username
   - Falls back to the #user-email ID
   - Uses the login_detection timeout (default: 10.0 seconds; effective timeout with retry/backoff)
   - Minimizes bot detection by avoiding JSON API requests that normal users wouldn't trigger
2. Auth probe fallback (more reliable): Sends a GET request to {root_url}/m-meine-anzeigen-verwalten.json?sort=DEFAULT
   - Returns LOGGED_IN if the response is HTTP 200 with valid JSON containing an "ads" key
   - Returns LOGGED_OUT if the response is HTTP 401/403 or the HTML contains login markers
   - Returns UNKNOWN on timeouts, assertion failures, or unexpected response bodies
   - Only used when the DOM check is inconclusive (UNKNOWN or timed out)
3. Diagnostics capture: If the state remains UNKNOWN and diagnostics.login_detection_capture is enabled
   - Captures a screenshot and HTML dump for troubleshooting
   - Pauses for manual inspection if diagnostics.pause_on_login_detection_failure is enabled and the bot is running in an interactive terminal
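
The decision order above can be sketched as follows. This is an illustrative model only, not the bot's code: `LOGIN_MARKERS`, `classify_auth_probe`, and `detect_login` are hypothetical names, and the marker string is a placeholder because the real login markers are not listed in this guide.

```python
import json
from enum import Enum


class LoginState(Enum):
    LOGGED_IN = "LOGGED_IN"
    LOGGED_OUT = "LOGGED_OUT"
    UNKNOWN = "UNKNOWN"


# Hypothetical marker; the real login markers are not documented here.
LOGIN_MARKERS = ("login-form",)


def classify_auth_probe(status: int, body: str) -> LoginState:
    """Map the auth probe's HTTP status and body to a login state."""
    if status in (401, 403):
        return LoginState.LOGGED_OUT
    if status == 200:
        try:
            data = json.loads(body)
        except ValueError:  # not JSON, e.g. an HTML page
            data = None
        if isinstance(data, dict) and "ads" in data:
            return LoginState.LOGGED_IN
    if any(marker in body for marker in LOGIN_MARKERS):
        return LoginState.LOGGED_OUT
    return LoginState.UNKNOWN


def detect_login(dom_state: LoginState, probe) -> LoginState:
    """Trust a conclusive DOM result; call the probe only when it is UNKNOWN."""
    if dom_state is not LoginState.UNKNOWN:
        return dom_state
    status, body = probe()
    return classify_auth_probe(status, body)
```

Passing the probe as a callable keeps the server-side request lazy, which mirrors the stated goal of avoiding JSON API requests unless the DOM check is inconclusive.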

What login_detection controls:

Maximum time (seconds) to wait for user profile DOM elements when checking if already logged in
Default: 10.0 seconds (effective timeout with retry/backoff)
Used at startup before attempting login
Note: With DOM-first order, this timeout applies to the primary DOM check path

When to increase login_detection:

Frequent unnecessary re-logins despite being authenticated
Slow or unstable network connection
Using browser profiles that haven't been active for weeks

⚠️ PII Warning: HTML dumps captured by diagnostics may contain your account email or other personally identifiable information. Review files in the diagnostics output directory before sharing them publicly.

Example:

timeouts:
  login_detection: 15.0  # For slower networks or old sessions

# Enable diagnostics when troubleshooting login detection issues
diagnostics:
  login_detection_capture: true  # Capture artifacts on UNKNOWN state
  pause_on_login_detection_failure: true  # Pause for inspection (interactive only)
  output_dir: "./diagnostics"  # Custom output directory (optional)
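
Before sharing a captured HTML dump, it can help to mask email addresses first. The sketch below is a simple, assumption-laden helper (the name `redact_emails` and the regular expression are illustrative); it does not catch every form of personal data, so a manual review is still advisable.

```python
import re

# Deliberately simple pattern; it targets common address shapes only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(html: str, placeholder: str = "[redacted-email]") -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_RE.sub(placeholder, html)
```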

Common Issues and Solutions

Issue 1: "Failed to connect to browser" with "root" error

Symptoms:

Error message mentions "One of the causes could be when you are running as root"
Connection fails when using existing browser profiles

Causes:

Running the application as root user
Browser profile is locked or in use by another process
Insufficient permissions to access the browser profile
Browser is not properly started with remote debugging enabled

Solutions:

1. Don't run as root

# ❌ Don't do this
sudo pdm run app publish

# ✅ Do this instead
pdm run app publish
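
If you wrap the bot in your own script, a guard like the following can refuse to continue under root. This is a hypothetical helper, not part of the bot; `os.geteuid` does not exist on Windows, so the sketch treats that case as "not root".

```python
import os


def running_as_root() -> bool:
    """Return True when the effective user ID is 0 (POSIX only)."""
    geteuid = getattr(os, "geteuid", None)  # missing on Windows
    return geteuid is not None and geteuid() == 0


if running_as_root():
    print("Refusing to start the browser as root; use a regular user account.")
```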

2. Close all browser instances

# On Linux/macOS
pkill -f chrome
pkill -f chromium
pkill -f msedge

# On Windows
taskkill /f /im chrome.exe
taskkill /f /im msedge.exe

3. Remove user_data_dir temporarily

Edit your config.yaml and comment out or remove the user_data_dir line:

browser:
  # user_data_dir: C:\Users\user\AppData\Local\Microsoft\Edge\User Data  # Comment this out
  profile_name: "Default"

4. Start browser manually with remote debugging

# For Chrome (macOS)
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug-profile

# For Chrome (Linux)
google-chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug-profile

# For Chrome (Windows)
"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222 --user-data-dir=C:\temp\chrome-debug-profile

# For Edge (macOS)
/Applications/Microsoft\ Edge.app/Contents/MacOS/Microsoft\ Edge --remote-debugging-port=9222 --user-data-dir=/tmp/edge-debug-profile

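# For Edge (Linux)
microsoft-edge --remote-debugging-port=9222 --user-data-dir=/tmp/edge-debug-profile

# For Edge (Windows)
msedge --remote-debugging-port=9222 --user-data-dir=C:\temp\edge-debug-profile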

# For Chromium (Linux)
chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug-profile

Then in your config.yaml:

browser:
  arguments:
    - --remote-debugging-port=9222
    - --user-data-dir=/tmp/chrome-debug-profile  # Must match the command line
  user_data_dir: "/tmp/chrome-debug-profile"     # Must match the argument above
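
Because the two values must agree, a quick consistency check over the parsed `browser` section can catch typos. This is a sketch with hypothetical function names; it assumes the configuration has already been loaded into a plain dictionary.

```python
def user_data_dir_from_arguments(arguments: list[str]):
    """Extract the value of --user-data-dir=... from the argument list, if present."""
    for arg in arguments:
        if arg.startswith("--user-data-dir="):
            return arg.split("=", 1)[1]
    return None


def user_data_dir_matches(browser_cfg: dict) -> bool:
    """True when the argument and the user_data_dir setting name the same path."""
    from_args = user_data_dir_from_arguments(browser_cfg.get("arguments", []))
    return from_args is not None and from_args == browser_cfg.get("user_data_dir")
```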

⚠️ IMPORTANT: Chrome 136+ Security Requirement

Starting with Chrome 136 (announced in March 2025), Google requires a non-default --user-data-dir to be specified when using --remote-debugging-port. This prevents attackers from accessing the default Chrome profile and stealing cookies/credentials. See Chrome's security announcement for more details.

Issue 2: "Browser process not reachable at 127.0.0.1:9222"

Symptoms:

Port check fails when trying to connect to existing browser
Browser appears to be running but connection fails

Causes:

Browser not started with remote debugging port
Port is blocked by firewall
Browser crashed or closed
Timing issue - browser not fully started
Browser update changed remote debugging behavior
Existing Chrome instance conflicts with new debugging session
Chrome 136+ security requirement not met (most common cause since March 2025)

Solutions:

1. Verify browser is started with remote debugging

Make sure your browser is started with the correct flag:

# Check if browser is running with remote debugging
netstat -an | grep 9222  # Linux/macOS
netstat -an | findstr 9222  # Windows
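
The same check works across platforms with a short script that attempts a TCP connection. This is a minimal sketch; the function name `is_port_open` is illustrative and the defaults mirror the port used in this guide.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 9222, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # connection refused, timed out, unreachable, ...
        return False
```

A successful connection only shows that some process is listening; it does not prove that the listener is a browser's debugging endpoint.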

2. Start browser manually first

# Start browser with remote debugging
chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug

# Then run the bot
kleinanzeigen-bot publish  # For binary users
# or
pdm run app publish        # For source users

3. macOS-specific: Chrome started but connection fails

If you're on macOS and Chrome is started with remote debugging but the bot still can't connect:

⚠️ IMPORTANT: macOS Security Requirement

This is a Chrome/macOS security issue that requires a dedicated user data directory.

# Method 1: Use the full path to Chrome with dedicated user data directory
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
  --remote-debugging-port=9222 \
  --user-data-dir=/tmp/chrome-debug-profile \
  --disable-dev-shm-usage

# Method 2: Use open command with proper arguments
open -a "Google Chrome" --args \
  --remote-debugging-port=9222 \
  --user-data-dir=/tmp/chrome-debug-profile \
  --disable-dev-shm-usage

# Verify: Check if Chrome is actually listening on the port
lsof -i :9222
curl http://localhost:9222/json/version

⚠️ CRITICAL: You must also configure the same user data directory in your config.yaml:

browser:
  arguments:
    - --remote-debugging-port=9222
    - --user-data-dir=/tmp/chrome-debug-profile
    - --disable-dev-shm-usage
  user_data_dir: "/tmp/chrome-debug-profile"

Common macOS issues:

Chrome/macOS security restrictions require a dedicated user data directory
The --user-data-dir flag is mandatory for remote debugging on macOS
Use --disable-dev-shm-usage to avoid shared memory issues
The user data directory must match between manual Chrome startup and config.yaml

4. Browser update issues

If it worked before but stopped working after a browser update:

# Check your browser version
# macOS
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --version

# Linux
google-chrome --version

# Windows
"C:\Program Files\Google\Chrome\Application\chrome.exe" --version

# Close all browser instances first
pkill -f "Google Chrome"  # macOS
pkill -f chrome           # Linux
# or
taskkill /f /im chrome.exe  # Windows

# Start fresh with proper flags (see macOS-specific section above for details)
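
To decide whether the Chrome 136+ requirement applies, the major version can be read from the `--version` output. The sketch below assumes the output contains a four-part version number; the function name and the sample strings in the usage note are illustrative.

```python
import re


def major_version(version_output: str):
    """Extract the major version from text such as the output of `--version`.

    Returns None when no four-part version number is found.
    """
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    return int(match.group(1)) if match else None
```

For example, an output line of the form `Google Chrome 136.0.7103.59` yields 136.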

After browser updates:

Chrome may have changed how remote debugging works
Security restrictions may have been updated
Try using a fresh user data directory to avoid conflicts
Ensure you're using the latest version of the bot

5. Chrome 136+ Security Changes (March 2025)

If you're using Chrome 136 or later and remote debugging stopped working:

The Problem: Google implemented security changes in Chrome 136 that prevent --remote-debugging-port from working with the default user data directory. This was done to protect users from cookie theft attacks.

The Solution: You must now specify a custom --user-data-dir when using remote debugging:

# ❌ This will NOT work with Chrome 136+
chrome --remote-debugging-port=9222

# ✅ This WILL work with Chrome 136+
chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug-profile

In your config.yaml:

browser:
  arguments:
    - --remote-debugging-port=9222
    - --user-data-dir=/tmp/chrome-debug-profile  # Required for Chrome 136+
  user_data_dir: "/tmp/chrome-debug-profile"     # Must match the argument above

Why this change was made:

Prevents attackers from accessing the default Chrome profile
Protects cookies and login credentials
Uses a different encryption key for the custom profile
Makes debugging more secure

For more information, see Chrome's security announcement mentioned under Issue 1.

6. Check firewall settings

Windows: Check Windows Defender Firewall
macOS: Check System Preferences > Security & Privacy > Firewall (on newer macOS versions: System Settings > Network > Firewall)
Linux: Check iptables or ufw settings

7. Use different port

Try a different port in case 9222 is blocked:

browser:
  arguments:
    - --remote-debugging-port=9223
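
Instead of guessing, you can ask the operating system for an unused port and use that number in both the browser command and config.yaml. This is a small sketch with a hypothetical helper name.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Bind to port 0 so the OS assigns a currently unused port, then release it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


print(find_free_port())
```

The port is released before the browser starts, so another process could claim it in between; start the browser promptly after picking the number.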

Issue 3: Profile directory issues

Symptoms:

Errors about profile directory not found
Permission denied errors
Profile locked errors

Solutions:

1. Use temporary profile

browser:
  user_data_dir: "/tmp/chrome-temp"  # Linux/macOS
  # user_data_dir: "C:\\temp\\chrome-temp"  # Windows
  profile_name: "Default"
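
A fresh, uniquely named profile directory avoids collisions with earlier runs. The sketch below creates one and assembles a matching launch command; `build_launch_command` is a hypothetical helper and `google-chrome` is only an example binary name.

```python
import tempfile


def build_launch_command(binary: str, port: int, user_data_dir: str) -> list[str]:
    """Assemble the browser command line with the two flags this guide relies on."""
    return [
        binary,
        f"--remote-debugging-port={port}",
        f"--user-data-dir={user_data_dir}",
    ]


# Create a new, empty profile directory under the system temp location
profile_dir = tempfile.mkdtemp(prefix="chrome-debug-")
command = build_launch_command("google-chrome", 9222, profile_dir)
print(command)  # pass this list to subprocess.Popen to start the browser
```

Remember to put the same directory into user_data_dir in config.yaml.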

2. Check profile permissions

# Linux/macOS
ls -la ~/.config/google-chrome/
chmod 755 ~/.config/google-chrome/

# Windows
# Check folder permissions in Properties > Security
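
A platform-independent way to run the same check is shown below. This is a sketch with a hypothetical function name; `os.access` reports permissions for the current user, and on Windows its results are less precise than the Security tab.

```python
import os


def profile_dir_problems(path: str) -> list[str]:
    """List permission problems the current user has with a profile directory."""
    if not os.path.isdir(path):
        return [f"{path} is not a directory"]
    problems = []
    checks = ((os.R_OK, "readable"), (os.W_OK, "writable"), (os.X_OK, "traversable"))
    for flag, label in checks:
        if not os.access(path, flag):
            problems.append(f"{path} is not {label}")
    return problems
```

An empty list means the directory exists and is readable, writable, and traversable for the user running the bot.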

3. Remove profile temporarily

browser:
  # user_data_dir: ""  # Comment out or remove
  # profile_name: ""   # Comment out or remove
  use_private_window: true

Issue 4: Platform-specific issues

Windows

Antivirus software: Add browser executable to exclusions
Windows Defender: Add folder to exclusions
UAC: Run as administrator if needed (but not recommended)

macOS

Gatekeeper: Allow browser in System Preferences > Security & Privacy (System Settings > Privacy & Security on newer macOS versions)
SIP: System Integrity Protection might block some operations
Permissions: Grant full disk access to terminal/IDE

Linux

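Sandbox: If the browser fails to start because of sandbox errors, add --no-sandbox to browser arguments (this turns off the browser sandbox, so use it only when needed)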
Root user: Never run as root, use regular user
Display: Ensure X11 or Wayland is properly configured

Configuration Examples

Basic working configuration

browser:
  arguments:
    - --disable-dev-shm-usage
    - --no-sandbox
  use_private_window: true

Using existing browser

browser:
  arguments:
    - --remote-debugging-port=9222
    - --user-data-dir=/tmp/chrome-debug-profile  # Required for Chrome 136+
  user_data_dir: "/tmp/chrome-debug-profile"     # Must match the argument above
  binary_location: "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe"

Using existing browser on macOS (REQUIRED configuration)

browser:
  arguments:
    - --remote-debugging-port=9222
    - --user-data-dir=/tmp/chrome-debug-profile
    - --disable-dev-shm-usage
  user_data_dir: "/tmp/chrome-debug-profile"
  binary_location: "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"

Using specific profile

browser:
  user_data_dir: "C:\\Users\\username\\AppData\\Local\\Google\\Chrome\\User Data"
  profile_name: "Profile 1"
  arguments:
    - --disable-dev-shm-usage

Advanced Troubleshooting

Check browser compatibility

# Test if browser can be started manually
# macOS
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --version
/Applications/Microsoft\ Edge.app/Contents/MacOS/Microsoft\ Edge --version

# Linux
google-chrome --version
microsoft-edge --version
chromium --version

# Windows
"C:\Program Files\Google\Chrome\Application\chrome.exe" --version
msedge --version

Monitor browser processes

# Linux/macOS
ps aux | grep chrome
lsof -i :9222

# Windows
tasklist | findstr chrome
netstat -an | findstr 9222

Debug with verbose logging

kleinanzeigen-bot -v publish  # For binary users
# or
pdm run app -v publish        # For source users

Test browser connection manually

# Test if port is accessible
curl http://localhost:9222/json/version
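
The same endpoint can be queried from Python, which also makes it easy to pull out the fields that matter. This is a sketch with hypothetical function names; it assumes the response is a JSON object with `Browser` and `webSocketDebuggerUrl` keys, as returned by Chromium-based browsers.

```python
import json
import urllib.request


def parse_version_payload(payload: str) -> dict:
    """Pick the browser version string and WebSocket URL out of the JSON payload."""
    data = json.loads(payload)
    return {
        "browser": data.get("Browser"),
        "websocket_url": data.get("webSocketDebuggerUrl"),
    }


def fetch_version_info(port: int = 9222, timeout: float = 2.0) -> dict:
    """Query the local debugging endpoint; raises OSError if nothing answers."""
    url = f"http://localhost:{port}/json/version"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_version_payload(response.read().decode("utf-8"))
```

If `fetch_version_info()` raises an error while the browser is running, the browser was most likely started without a working remote debugging port.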

Using an Existing Browser Window

By default a new browser process will be launched. To reuse a manually launched browser window/process, follow these steps:

Manually launch your browser from the command line with the --remote-debugging-port=<NUMBER> flag. You are free to choose an unused port number between 1025 and 65535, for example:
- chrome --remote-debugging-port=9222
- chromium --remote-debugging-port=9222
- msedge --remote-debugging-port=9222
This runs the browser in debug mode, which allows the bot to control it remotely.

⚠️ IMPORTANT: Chrome 136+ Security Requirement

Starting with Chrome 136 (announced in March 2025), Google requires a non-default --user-data-dir to be specified when using --remote-debugging-port. This prevents attackers from accessing the default Chrome profile and stealing cookies/credentials.

You must now use:
```
chrome --remote-debugging-port=9222 --user-data-dir=/path/to/custom/directory
```
And in your config.yaml:
```
browser:
  arguments:
    - --remote-debugging-port=9222
    - --user-data-dir=/path/to/custom/directory
  user_data_dir: "/path/to/custom/directory"
```
The bot will automatically detect Chrome 136+ and validate your configuration. If validation fails, you'll see clear error messages with specific instructions on how to fix your configuration.

In your config.yaml specify the same flags as browser arguments, for example:

browser:
  arguments:
  - --remote-debugging-port=9222
  - --user-data-dir=/tmp/chrome-debug-profile  # Required for Chrome 136+
  user_data_dir: "/tmp/chrome-debug-profile"   # Must match the argument above

When you now publish ads, the manually launched browser is reused.

NOTE: If an existing browser is used, all other settings configured under browser in your config.yaml file are ignored, because they are only used to programmatically configure and launch a dedicated browser instance.

Security Note: This change was implemented by Google to protect users from cookie theft attacks. The custom user data directory uses a different encryption key than the default profile, making it more secure for debugging purposes.

Getting Help

If you're still experiencing issues:

Run the diagnostic command: kleinanzeigen-bot diagnose (binary) or pdm run app diagnose (source)
Check the log file for detailed error messages
Try the solutions above step by step
Create an issue on GitHub with:
- Output from the diagnose command
- Your config.yaml (remove sensitive information)
- Error messages from the log file
- Operating system and browser version

Prevention

To avoid browser connection issues:

Don't run as root - Always use a regular user account
Close other browser instances - Ensure no other browser processes are running
Use temporary profiles - Avoid conflicts with existing browser sessions
Keep browser updated - Use the latest stable version
Check permissions - Ensure proper file and folder permissions
Monitor system resources - Ensure sufficient memory and disk space
