docs: refactor guides for clearer navigation (#795)

## ℹ️ Description
Refactors and reorganizes documentation to improve navigation and keep
the README concise.

- Link to the related issue(s): Issue #N/A
- Describe the motivation and context for this change.
- The README had grown long and duplicated detailed config/ad
references; this consolidates docs into focused guides and adds an
index.

## 📋 Changes Summary
- Add dedicated docs pages for configuration, ad configuration, update
checks, and a docs index.
- Slim README and CONTRIBUTING to reference dedicated guides and clean
up formatting/markdownlint issues.
- Refresh browser troubleshooting and update-check guidance; keep the
update channel name aligned with schema/implementation.
- Add markdownlint configuration for consistent docs formatting.

### ⚙️ Type of Change
Select the type(s) of change(s) included in this pull request:
- [ ] 🐞 Bug fix (non-breaking change which fixes an issue)
- [x]  New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)


##  Checklist
Before requesting a review, confirm the following:
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Documentation**
* Reorganized and enhanced contributing guidelines with improved
structure and formatting
* Streamlined README with better organization and updated installation
instructions
* Added comprehensive configuration reference documentation for
configuration and ad settings
* Improved browser troubleshooting guide with updated guidance,
examples, and diagnostic information
  * Created new documentation index for easier navigation

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
This commit is contained in:
Jens
2026-01-30 11:06:36 +01:00
committed by GitHub
parent 3dc24e1df7
commit a4946ba104
9 changed files with 954 additions and 561 deletions

314
docs/AD_CONFIGURATION.md Normal file
View File

@@ -0,0 +1,314 @@
# Ad Configuration Reference
Complete reference for ad YAML files in kleinanzeigen-bot.
## File Format
Each ad is described in a separate JSON or YAML file with the default `ad_` prefix (for example, `ad_laptop.yaml`). You can customize the prefix via the `ad_files` pattern in `config.yaml`.
Examples below use YAML, but JSON uses the same keys and structure.
Parameter values specified in the `ad_defaults` section of `config.yaml` don't need to be specified again in the ad configuration file.
## Quick Start
Generate sample ad files using the download command:
```bash
# Download all ads from your profile
kleinanzeigen-bot download --ads=all
# Download only new ads (not locally saved yet)
kleinanzeigen-bot download --ads=new
# Download specific ads by ID
kleinanzeigen-bot download --ads=1,2,3
```
For full JSON schema with IDE autocompletion support, see:
- [schemas/ad.schema.json](../schemas/ad.schema.json)
## Configuration Structure
### Basic Ad Properties
Description values can be multiline. See <https://yaml-multiline.info/> for YAML syntax examples.
```yaml
# yaml-language-server: $schema=https://raw.githubusercontent.com/Second-Hand-Friends/kleinanzeigen-bot/refs/heads/main/schemas/ad.schema.json
active: # true or false (default: true)
type: # one of: OFFER, WANTED (default: OFFER)
title: # Ad title
description: # Ad description
```
### Description Prefix and Suffix
You can add prefix and suffix text to your ad descriptions in two ways:
#### New Format (Recommended)
In your `config.yaml` file you can specify a `description_prefix` and `description_suffix` under the `ad_defaults` section:
```yaml
ad_defaults:
description_prefix: "Prefix text"
description_suffix: "Suffix text"
```
#### Legacy Format
In your ad configuration file you can specify a `description_prefix` and `description_suffix`:
```yaml
description_prefix: "Prefix text"
description_suffix: "Suffix text"
```
#### Precedence
The ad-level setting has precedence over the `config.yaml` default. If you specify both, the ad-level setting will be used. We recommend using the `config.yaml` defaults as it is more flexible and easier to manage.
### Category
Built-in category name, custom category name from `config.yaml`, or category ID.
```yaml
# Built-in category name (see default list at
# https://github.com/Second-Hand-Friends/kleinanzeigen-bot/blob/main/src/kleinanzeigen_bot/resources/categories.yaml)
category: "Elektronik > Notebooks"
# Custom category name (defined in config.yaml)
category: "Verschenken & Tauschen > Tauschen"
# Category ID
category: 161/278
```
### Price and Price Type
```yaml
price: # Price in euros; decimals allowed but will be rounded to nearest whole euro on processing
# (prefer whole euros for predictability)
price_type: # one of: FIXED, NEGOTIABLE, GIVE_AWAY (default: NEGOTIABLE)
```
### Automatic Price Reduction
When `auto_price_reduction.enabled` is set to `true`, the bot lowers the configured `price` every time the ad is reposted.
**Important:** Price reductions only apply when using the `publish` command (which deletes the old ad and creates a new one). Using the `update` command to modify ad content does NOT trigger price reductions or increment `repost_count`.
`repost_count` is tracked for every ad (and persisted inside the corresponding `ad_*.yaml`) so reductions continue across runs.
`min_price` is required whenever `enabled` is `true` and must be less than or equal to `price`; this makes an explicit floor (including `0`) mandatory. If `min_price` equals the current price, the bot will log a warning and perform no reduction.
**Note:** `repost_count` and price reduction counters are only incremented and persisted after a successful publish. Failed publish attempts do not advance the counters.
```yaml
auto_price_reduction:
enabled: # true or false to enable automatic price reduction on reposts (default: false)
strategy: # "PERCENTAGE" or "FIXED" (required when enabled is true)
amount: # Reduction amount; interpreted as percent for PERCENTAGE or currency units for FIXED
# (prefer whole euros for predictability)
min_price: # Required when enabled is true; minimum price floor
# (use 0 for no lower bound, prefer whole euros for predictability)
delay_reposts: # Number of reposts to wait before first reduction (default: 0)
delay_days: # Number of days to wait after publication before reductions (default: 0)
```
**Note:** All prices are rounded to whole euros after each reduction step.
#### PERCENTAGE Strategy Example
```yaml
price: 150
price_type: FIXED
auto_price_reduction:
enabled: true
strategy: PERCENTAGE
amount: 10
min_price: 90
delay_reposts: 0
delay_days: 0
```
This posts the ad at 150 € the first time, then 135 € (10%), 122 € (10%), 110 € (10%), 99 € (10%), and stops decreasing at 90 €.
**Note:** The bot applies commercial rounding (ROUND_HALF_UP) to full euros after each reduction step. For example, 121.5 rounds to 122, and 109.8 rounds to 110. This step-wise rounding affects the final price progression, especially for percentage-based reductions.
#### FIXED Strategy Example
```yaml
price: 150
price_type: FIXED
auto_price_reduction:
enabled: true
strategy: FIXED
amount: 15
min_price: 90
delay_reposts: 0
delay_days: 0
```
This posts the ad at 150 € the first time, then 135 € (15 €), 120 € (15 €), 105 € (15 €), and stops decreasing at 90 €.
#### Note on `delay_days` Behavior
The `delay_days` parameter counts complete 24-hour periods (whole days) since the ad was published. For example, if `delay_days: 7` and the ad was published 6 days and 23 hours ago, the reduction will not yet apply. This ensures predictable behavior and avoids partial-day ambiguity.
Set `auto_price_reduction.enabled: false` (or omit the entire `auto_price_reduction` section) to keep the existing behavior—prices stay fixed and `repost_count` only acts as tracked metadata for future changes.
You can configure `auto_price_reduction` once under `ad_defaults` in `config.yaml`. The `min_price` can be set there or overridden per ad file as needed.
### Special Attributes
Special attributes are category-specific key/value pairs. Use the download command to inspect existing ads in your category and reuse the keys you see under `special_attributes`.
```yaml
special_attributes:
# Example for rental properties
# haus_mieten.zimmer_d: "3" # Number of rooms
```
### Shipping Configuration
```yaml
shipping_type: # one of: PICKUP, SHIPPING, NOT_APPLICABLE (default: SHIPPING)
shipping_costs: # e.g. 2.95 (for individual postage, keep shipping_type SHIPPING and leave shipping_options empty)
# Specify shipping options / packages
# It is possible to select multiple packages, but only from one size (S, M, L)!
# Possible package types for size S:
# - DHL_2
# - Hermes_Päckchen
# - Hermes_S
# Possible package types for size M:
# - DHL_5
# - Hermes_M
# Possible package types for size L:
# - DHL_10
# - DHL_20
# - DHL_31,5
# - Hermes_L
shipping_options: []
# Example (size S only):
# shipping_options:
# - DHL_2
# - Hermes_Päckchen
sell_directly: # true or false, requires shipping_type SHIPPING to take effect (default: false)
```
**Shipping types:**
- `PICKUP` - Buyer picks up the item
- `SHIPPING` - Item is shipped (requires shipping costs or options)
- `NOT_APPLICABLE` - Shipping not applicable for this item
**Sell Directly:**
When `sell_directly: true`, buyers can purchase the item directly through the platform without contacting the seller first. This feature only works when `shipping_type: SHIPPING`.
### Images
List of wildcard patterns to select images. If relative paths are specified, they are relative to this ad configuration file.
```yaml
images:
# - laptop_*.{jpg,png}
```
### Contact Information
Contact details for the ad. These override defaults from `config.yaml`.
```yaml
contact:
name:
street:
zipcode:
phone: "" # IMPORTANT: surround phone number with quotes to prevent removal of leading zeros
```
### Republication Interval
How often the ad should be republished (in days). Overrides `ad_defaults.republication_interval` from `config.yaml`.
```yaml
republication_interval: # every X days the ad should be re-published (default: 7)
```
### Auto-Managed Fields
The following fields are automatically managed by the bot. Do not manually edit these unless you know what you're doing.
```yaml
id: # The ID assigned by kleinanzeigen.de
created_on: # ISO timestamp when the ad was first published
updated_on: # ISO timestamp when the ad was last published
content_hash: # Hash of the ad content, used to detect changes
repost_count: # How often the ad has been (re)published; used for automatic price reductions
```
## Complete Example
```yaml
# yaml-language-server: $schema=https://raw.githubusercontent.com/Second-Hand-Friends/kleinanzeigen-bot/refs/heads/main/schemas/ad.schema.json
active: true
type: OFFER
title: "Example Ad Title"
description: |
This is a multi-line description.
You can add as much detail as you want here.
The bot will preserve line breaks and formatting.
description_prefix: "For sale: " # Optional ad-level override; defaults can live in config.yaml
description_suffix: " Please message if interested!" # Optional ad-level override
category: "Elektronik > Notebooks"
price: 150
price_type: FIXED
auto_price_reduction:
enabled: true
strategy: PERCENTAGE
amount: 10
min_price: 90
delay_reposts: 0
delay_days: 0
shipping_type: SHIPPING
shipping_costs: 4.95
sell_directly: true
images:
- "images/laptop_*.jpg"
contact:
name: "John Doe"
street: "Main Street 123"
zipcode: "12345"
phone: "0123456789"
republication_interval: 7
```
## Best Practices
1. **Use meaningful filenames**: Name your ad files descriptively, e.g., `ad_laptop_hp_15.yaml`
1. **Set defaults in config.yaml**: Put common values in `ad_defaults` to avoid repetition
1. **Test before bulk publishing**: Use `--ads=changed` or `--ads=new` to test changes before republishing all ads
1. **Back up your ad files**: Keep them in version control if you want to track changes
1. **Use price reductions carefully**: Set appropriate `min_price` to avoid underpricing
1. **Check shipping options**: Ensure your shipping options match the actual package size and cost
## Troubleshooting
- **Schema validation errors**: Run `kleinanzeigen-bot verify` (binary) or `pdm run app verify` (source) to see which fields fail validation.
- **Price reduction not applying**: Confirm `auto_price_reduction.enabled` is `true`, `min_price` is set, and you are using `publish` (not `update`). Remember ad-level values override `ad_defaults`.
- **Shipping configuration issues**: Use `shipping_type: SHIPPING` when setting `shipping_costs` or `shipping_options`, and pick options from a single size group (S/M/L).
- **Category not found**: Verify the category name or ID and check any custom mappings in `config.yaml`.
- **File naming/prefix mismatch**: Ensure ad files match your `ad_files` glob and prefix (default `ad_`).
- **Image path resolution**: Relative paths are resolved from the ad file location; use absolute paths and check file permissions if images are not found.

View File

@@ -8,13 +8,15 @@ This guide helps you resolve common browser connection issues with the kleinanze
Google implemented security changes in Chrome 136 that require `--user-data-dir` to be specified when using `--remote-debugging-port`. This prevents attackers from accessing the default Chrome profile and stealing cookies/credentials.
**Quick Fix:**
### Quick Fix
```bash
# Start Chrome with custom user data directory
chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug-profile
```
**In your config.yaml:**
### In your config.yaml
```yaml
browser:
arguments:
@@ -32,16 +34,19 @@ For more details, see [Chrome 136+ Security Changes](#5-chrome-136-security-chan
Run the diagnostic command to automatically check your setup:
**For binary users:**
```bash
kleinanzeigen-bot diagnose
```
**For source users:**
```bash
pdm run app diagnose
```
This will check:
- Browser binary availability and permissions
- User data directory permissions
- Remote debugging port status
@@ -52,7 +57,7 @@ This will check:
**Automatic Chrome 136+ Validation:**
The bot automatically detects Chrome/Edge 136+ and validates your configuration. If you're using Chrome 136+ with remote debugging but missing the required `--user-data-dir` setting, you'll see clear error messages like:
```
```console
Chrome 136+ configuration validation failed: Chrome 136+ requires --user-data-dir
Please update your configuration to include --user-data-dir for remote debugging
```
@@ -62,48 +67,59 @@ The bot will also provide specific instructions on how to fix your configuration
### Issue: Slow page loads or recurring TimeoutError
**Symptoms:**
- `_extract_category_from_ad_page` fails intermittently due to breadcrumb lookups timing out
- Captcha/SMS/GDPR prompts appear right after a timeout
- Requests to GitHub's API fail sporadically with timeout errors
**Solutions:**
1. Increase `timeouts.multiplier` in `config.yaml` (e.g. `2.0` doubles every timeout consistently).
2. Override specific keys under `timeouts` (e.g. `pagination_initial: 20.0`) if only a single selector is problematic.
3. Keep `retry_enabled` on so that DOM lookups are retried with exponential backoff.
1. Increase `timeouts.multiplier` in `config.yaml` (e.g., `2.0` doubles every timeout consistently).
1. Override specific keys under `timeouts` (e.g., `pagination_initial: 20.0`) if only a single selector is problematic.
1. For slow email verification prompts, raise `timeouts.email_verification`.
1. Keep `retry_enabled` on so that DOM lookups are retried with exponential backoff.
### Issue: Bot fails to detect existing login session
**Symptoms:**
- Bot re-logins despite being already authenticated
- Intermittent (50/50) login detection behavior
- More common with profiles unused for 20+ days
**How login detection works:**
The bot checks your login status using a fast server request first, with a fallback to checking page elements if needed.
The bot checks your login status using page elements first (to minimize bot-like behavior), with a fallback to a server-side request if needed.
The bot uses a **server-side auth probe** as the primary method to detect login state:
The bot uses a **DOM-based check** as the primary method to detect login state:
1. **Auth probe (preferred)**: Sends a GET request to `{root_url}/m-meine-anzeigen-verwalten.json?sort=DEFAULT`
- Returns `LOGGED_IN` if the response is HTTP 200 with valid JSON containing `"ads"` key
- Returns `LOGGED_OUT` if response is HTTP 401/403 or HTML contains login markers
- Returns `UNKNOWN` on timeouts, assertion failures, or unexpected response bodies
1. **DOM check (preferred - stealthy)**: Checks for user profile elements in the page
2. **DOM fallback**: Only used when the auth probe returns `UNKNOWN`
- Looks for `.mr-medium` element containing username
- Falls back to `#user-email` ID
- Uses the `login_detection` timeout (default: 10.0 seconds with effective timeout with retry/backoff)
- Minimizes bot detection by avoiding JSON API requests that normal users wouldn't trigger
2. **Auth probe fallback (more reliable)**: Sends a GET request to `{root_url}/m-meine-anzeigen-verwalten.json?sort=DEFAULT`
- Returns `LOGGED_IN` if the response is HTTP 200 with valid JSON containing `"ads"` key
- Returns `LOGGED_OUT` if response is HTTP 401/403 or HTML contains login markers
- Returns `UNKNOWN` on timeouts, assertion failures, or unexpected response bodies
- Only used when DOM check is inconclusive (UNKNOWN or timed out)
3. **Diagnostics capture**: If the state remains `UNKNOWN` and `diagnostics.login_detection_capture` is enabled
- Captures a screenshot and HTML dump for troubleshooting
- Pauses for manual inspection if `diagnostics.pause_on_login_detection_failure` is enabled and running in an interactive terminal
- Captures a screenshot and HTML dump for troubleshooting
- Pauses for manual inspection if `diagnostics.pause_on_login_detection_failure` is enabled and running in an interactive terminal
**What `login_detection` controls:**
- Maximum time (seconds) to wait for user profile DOM elements when checking if already logged in
- Default: `10.0` seconds (effective timeout with retry/backoff)
- Used at startup before attempting login
- Note: With the new auth probe, this timeout only applies to the DOM fallback path
- Note: With DOM-first order, this timeout applies to the primary DOM check path
**When to increase `login_detection`:**
- Frequent unnecessary re-logins despite being authenticated
- Slow or unstable network connection
- Using browser profiles that haven't been active for weeks
@@ -111,6 +127,7 @@ The bot uses a **server-side auth probe** as the primary method to detect login
> **⚠️ PII Warning:** HTML dumps captured by diagnostics may contain your account email or other personally identifiable information. Review files in the diagnostics output directory before sharing them publicly.
**Example:**
```yaml
timeouts:
login_detection: 15.0 # For slower networks or old sessions
@@ -127,18 +144,21 @@ diagnostics:
### Issue 1: "Failed to connect to browser" with "root" error
**Symptoms:**
- Error message mentions "One of the causes could be when you are running as root"
- Connection fails when using existing browser profiles
**Causes:**
1. Running the application as root user
2. Browser profile is locked or in use by another process
3. Insufficient permissions to access the browser profile
4. Browser is not properly started with remote debugging enabled
1. Browser profile is locked or in use by another process
1. Insufficient permissions to access the browser profile
1. Browser is not properly started with remote debugging enabled
**Solutions:**
#### 1. Don't run as root
```bash
# ❌ Don't do this
sudo pdm run app publish
@@ -148,6 +168,7 @@ pdm run app publish
```
#### 2. Close all browser instances
```bash
# On Linux/macOS
pkill -f chrome
@@ -160,7 +181,9 @@ taskkill /f /im msedge.exe
```
#### 3. Remove user_data_dir temporarily
Edit your `config.yaml` and comment out or remove the `user_data_dir` line:
```yaml
browser:
# user_data_dir: C:\Users\user\AppData\Local\Microsoft\Edge\User Data # Comment this out
@@ -168,6 +191,7 @@ browser:
```
#### 4. Start browser manually with remote debugging
```bash
# For Chrome (macOS)
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug-profile
@@ -189,6 +213,7 @@ chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug-profil
```
Then in your `config.yaml`:
```yaml
browser:
arguments:
@@ -197,29 +222,33 @@ browser:
user_data_dir: "/tmp/chrome-debug-profile" # Must match the argument above
```
**⚠️ IMPORTANT: Chrome 136+ Security Requirement**
#### ⚠️ IMPORTANT: Chrome 136+ Security Requirement
Starting with Chrome 136 (March 2025), Google has implemented security changes that require `--user-data-dir` to be specified when using `--remote-debugging-port`. This prevents attackers from accessing the default Chrome profile and stealing cookies/credentials. See [Chrome's security announcement](https://developer.chrome.com/blog/remote-debugging-port?hl=de) for more details.
### Issue 2: "Browser process not reachable at 127.0.0.1:9222"
**Symptoms:**
- Port check fails when trying to connect to existing browser
- Browser appears to be running but connection fails
**Causes:**
1. Browser not started with remote debugging port
2. Port is blocked by firewall
3. Browser crashed or closed
4. Timing issue - browser not fully started
5. Browser update changed remote debugging behavior
6. Existing Chrome instance conflicts with new debugging session
7. **Chrome 136+ security requirement not met** (most common cause since March 2025)
1. Port is blocked by firewall
1. Browser crashed or closed
1. Timing issue - browser not fully started
1. Browser update changed remote debugging behavior
1. Existing Chrome instance conflicts with new debugging session
1. **Chrome 136+ security requirement not met** (most common cause since March 2025)
**Solutions:**
#### 1. Verify browser is started with remote debugging
Make sure your browser is started with the correct flag:
```bash
# Check if browser is running with remote debugging
netstat -an | grep 9222 # Linux/macOS
@@ -227,6 +256,7 @@ netstat -an | findstr 9222 # Windows
```
#### 2. Start browser manually first
```bash
# Start browser with remote debugging
chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug
@@ -238,9 +268,12 @@ pdm run app publish # For source users
```
#### 3. macOS-specific: Chrome started but connection fails
If you're on macOS and Chrome is started with remote debugging but the bot still can't connect:
**⚠️ IMPORTANT: This is a Chrome/macOS security issue that requires a dedicated user data directory**
#### ⚠️ IMPORTANT: macOS Security Requirement
This is a Chrome/macOS security issue that requires a dedicated user data directory.
```bash
# Method 1: Use the full path to Chrome with dedicated user data directory
@@ -272,12 +305,14 @@ browser:
```
**Common macOS issues:**
- Chrome/macOS security restrictions require a dedicated user data directory
- The `--user-data-dir` flag is **mandatory** for remote debugging on macOS
- Use `--disable-dev-shm-usage` to avoid shared memory issues
- The user data directory must match between manual Chrome startup and config.yaml
#### 4. Browser update issues
If it worked before but stopped working after a browser update:
```bash
@@ -300,12 +335,14 @@ taskkill /f /im chrome.exe # Windows
```
**After browser updates:**
- Chrome may have changed how remote debugging works
- Security restrictions may have been updated
- Try using a fresh user data directory to avoid conflicts
- Ensure you're using the latest version of the bot
#### 5. Chrome 136+ Security Changes (March 2025)
If you're using Chrome 136 or later and remote debugging stopped working:
**The Problem:**
@@ -323,6 +360,7 @@ chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug-profile
```
**In your config.yaml:**
```yaml
browser:
arguments:
@@ -332,22 +370,27 @@ browser:
```
**Why this change was made:**
- Prevents attackers from accessing the default Chrome profile
- Protects cookies and login credentials
- Uses a different encryption key for the custom profile
- Makes debugging more secure
**For more information:**
- [Chrome's security announcement](https://developer.chrome.com/blog/remote-debugging-port?hl=de)
- [GitHub issue discussion](https://github.com/Second-Hand-Friends/kleinanzeigen-bot/issues/604)
#### 5. Check firewall settings
#### 6. Check firewall settings
- Windows: Check Windows Defender Firewall
- macOS: Check System Preferences > Security & Privacy > Firewall
- Linux: Check iptables or ufw settings
#### 6. Use different port
#### 7. Use different port
Try a different port in case 9222 is blocked:
```yaml
browser:
arguments:
@@ -357,6 +400,7 @@ browser:
### Issue 3: Profile directory issues
**Symptoms:**
- Errors about profile directory not found
- Permission denied errors
- Profile locked errors
@@ -364,6 +408,7 @@ browser:
**Solutions:**
#### 1. Use temporary profile
```yaml
browser:
user_data_dir: "/tmp/chrome-temp" # Linux/macOS
@@ -372,6 +417,7 @@ browser:
```
#### 2. Check profile permissions
```bash
# Linux/macOS
ls -la ~/.config/google-chrome/
@@ -382,6 +428,7 @@ chmod 755 ~/.config/google-chrome/
```
#### 3. Remove profile temporarily
```yaml
browser:
# user_data_dir: "" # Comment out or remove
@@ -392,16 +439,19 @@ browser:
### Issue 4: Platform-specific issues
#### Windows
- **Antivirus software**: Add browser executable to exclusions
- **Windows Defender**: Add folder to exclusions
- **UAC**: Run as administrator if needed (but not recommended)
#### macOS
- **Gatekeeper**: Allow browser in System Preferences > Security & Privacy
- **SIP**: System Integrity Protection might block some operations
- **Permissions**: Grant full disk access to terminal/IDE
#### Linux
- **Sandbox**: Add `--no-sandbox` to browser arguments
- **Root user**: Never run as root, use regular user
- **Display**: Ensure X11 or Wayland is properly configured
@@ -409,6 +459,7 @@ browser:
## Configuration Examples
### Basic working configuration
```yaml
browser:
arguments:
@@ -418,6 +469,7 @@ browser:
```
### Using existing browser
```yaml
browser:
arguments:
@@ -428,6 +480,7 @@ browser:
```
### Using existing browser on macOS (REQUIRED configuration)
```yaml
browser:
arguments:
@@ -439,6 +492,7 @@ browser:
```
### Using specific profile
```yaml
browser:
user_data_dir: "C:\\Users\\username\\AppData\\Local\\Google\\Chrome\\User Data"
@@ -450,6 +504,7 @@ browser:
## Advanced Troubleshooting
### Check browser compatibility
```bash
# Test if browser can be started manually
# macOS
@@ -467,6 +522,7 @@ msedge --version
```
### Monitor browser processes
```bash
# Linux/macOS
ps aux | grep chrome
@@ -478,6 +534,7 @@ netstat -an | findstr 9222
```
### Debug with verbose logging
```bash
kleinanzeigen-bot -v publish # For binary users
# or
@@ -485,19 +542,72 @@ pdm run app -v publish # For source users
```
### Test browser connection manually
```bash
# Test if port is accessible
curl http://localhost:9222/json/version
```
## Using an Existing Browser Window
By default a new browser process will be launched. To reuse a manually launched browser window/process, follow these steps:
1. Manually launch your browser from the command line with the `--remote-debugging-port=<NUMBER>` flag.
You are free to choose an unused port number between 1025 and 65535, for example:
- `chrome --remote-debugging-port=9222`
- `chromium --remote-debugging-port=9222`
- `msedge --remote-debugging-port=9222`
This runs the browser in debug mode which allows it to be remote controlled by the bot.
**⚠️ IMPORTANT: Chrome 136+ Security Requirement**
Starting with Chrome 136 (March 2025), Google has implemented security changes that require `--user-data-dir` to be specified when using `--remote-debugging-port`. This prevents attackers from accessing the default Chrome profile and stealing cookies/credentials.
**You must now use:**
```bash
chrome --remote-debugging-port=9222 --user-data-dir=/path/to/custom/directory
```
**And in your config.yaml:**
```yaml
browser:
arguments:
- --remote-debugging-port=9222
- --user-data-dir=/path/to/custom/directory
user_data_dir: "/path/to/custom/directory"
```
**The bot will automatically detect Chrome 136+ and validate your configuration. If validation fails, you'll see clear error messages with specific instructions on how to fix your configuration.**
1. In your config.yaml specify the same flags as browser arguments, for example:
```yaml
browser:
arguments:
- --remote-debugging-port=9222
- --user-data-dir=/tmp/chrome-debug-profile # Required for Chrome 136+
user_data_dir: "/tmp/chrome-debug-profile" # Must match the argument above
```
1. When now publishing ads the manually launched browser will be re-used.
> NOTE: If an existing browser is used all other settings configured under `browser` in your config.yaml file will be ignored
> because they are only used to programmatically configure/launch a dedicated browser instance.
>
> **Security Note:** This change was implemented by Google to protect users from cookie theft attacks. The custom user data directory uses a different encryption key than the default profile, making it more secure for debugging purposes.
## Getting Help
If you're still experiencing issues:
1. Run the diagnostic command: `kleinanzeigen-bot diagnose` (binary) or `pdm run app diagnose` (source)
2. Check the log file for detailed error messages
3. Try the solutions above step by step
4. Create an issue on GitHub with:
1. Check the log file for detailed error messages
1. Try the solutions above step by step
1. Create an issue on GitHub with:
- Output from the diagnose command
- Your `config.yaml` (remove sensitive information)
- Error messages from the log file
@@ -508,8 +618,8 @@ If you're still experiencing issues:
To avoid browser connection issues:
1. **Don't run as root** - Always use a regular user account
2. **Close other browser instances** - Ensure no other browser processes are running
3. **Use temporary profiles** - Avoid conflicts with existing browser sessions
4. **Keep browser updated** - Use the latest stable version
5. **Check permissions** - Ensure proper file and folder permissions
6. **Monitor system resources** - Ensure sufficient memory and disk space
1. **Close other browser instances** - Ensure no other browser processes are running
1. **Use temporary profiles** - Avoid conflicts with existing browser sessions
1. **Keep browser updated** - Use the latest stable version
1. **Check permissions** - Ensure proper file and folder permissions
1. **Monitor system resources** - Ensure sufficient memory and disk space

296
docs/CONFIGURATION.md Normal file
View File

@@ -0,0 +1,296 @@
# Configuration Reference
Complete reference for `config.yaml`, the main configuration file for kleinanzeigen-bot.
## Quick Start
To generate a default configuration file with all current defaults:
```bash
kleinanzeigen-bot create-config
```
For full JSON schema with IDE autocompletion support, see:
- [schemas/config.schema.json](../schemas/config.schema.json)
To enable IDE autocompletion in `config.yaml`, add this at the top of the file:
```yaml
# yaml-language-server: $schema=schemas/config.schema.json
```
For ad files, use the ad schema instead:
```yaml
# yaml-language-server: $schema=schemas/ad.schema.json
```
## File Location
The bot looks for `config.yaml` in the current directory by default. You can specify a different location using the `--config` command line option:
```bash
kleinanzeigen-bot --config /path/to/config.yaml publish
```
Valid file extensions: `.json`, `.yaml`, `.yml`
## Configuration Structure
### ad_files
Glob (wildcard) patterns to select ad configuration files. If relative paths are specified, they are relative to this configuration file.
```yaml
ad_files:
- "./**/ad_*.{json,yml,yaml}"
```
### ad_defaults
Default values for ads that can be overridden in each ad configuration file.
```yaml
ad_defaults:
active: true
type: OFFER # one of: OFFER, WANTED
description_prefix: ""
description_suffix: ""
price_type: NEGOTIABLE # one of: FIXED, NEGOTIABLE, GIVE_AWAY, NOT_APPLICABLE
shipping_type: SHIPPING # one of: PICKUP, SHIPPING, NOT_APPLICABLE
# NOTE: shipping_costs and shipping_options must be configured per-ad, not as defaults
sell_directly: false # requires shipping_type SHIPPING to take effect
contact:
name: ""
street: ""
zipcode: ""
phone: "" # IMPORTANT: surround phone number with quotes to prevent removal of leading zeros
republication_interval: 7 # every X days ads should be re-published
```
> **Tip:** For current defaults of all timeout and diagnostic settings, run `kleinanzeigen-bot create-config` or see the [JSON schema](../schemas/config.schema.json).
### categories
Additional name to category ID mappings. See the default list at:
[https://github.com/Second-Hand-Friends/kleinanzeigen-bot/blob/main/src/kleinanzeigen_bot/resources/categories.yaml](https://github.com/Second-Hand-Friends/kleinanzeigen-bot/blob/main/src/kleinanzeigen_bot/resources/categories.yaml)
```yaml
categories:
Verschenken & Tauschen > Tauschen: 272/273
Verschenken & Tauschen > Verleihen: 272/274
Verschenken & Tauschen > Verschenken: 272/192
```
### timeouts
Timeout tuning for various browser operations. Adjust these if you experience slow page loads or recurring timeouts.
```yaml
timeouts:
multiplier: 1.0 # Scale all timeouts (e.g. 2.0 for slower networks)
default: 5.0 # Base timeout for web_find/web_click/etc.
page_load: 15.0 # Timeout for web_open page loads
captcha_detection: 2.0 # Timeout for captcha iframe detection
sms_verification: 4.0 # Timeout for SMS verification banners
email_verification: 4.0 # Timeout for email verification prompts
gdpr_prompt: 10.0 # Timeout when handling GDPR dialogs
login_detection: 10.0 # Timeout for DOM-based login detection (primary method)
publishing_result: 300.0 # Timeout for publishing status checks
publishing_confirmation: 20.0 # Timeout for publish confirmation redirect
image_upload: 30.0 # Timeout for image upload and server-side processing
pagination_initial: 10.0 # Timeout for first pagination lookup
pagination_follow_up: 5.0 # Timeout for subsequent pagination clicks
quick_dom: 2.0 # Generic short DOM timeout (shipping dialogs, etc.)
update_check: 10.0 # Timeout for GitHub update requests
chrome_remote_probe: 2.0 # Timeout for local remote-debugging probes
chrome_remote_debugging: 5.0 # Timeout for remote debugging API calls
chrome_binary_detection: 10.0 # Timeout for chrome --version subprocess
retry_enabled: true # Enables DOM retry/backoff when timeouts occur
retry_max_attempts: 2
retry_backoff_factor: 1.5
```
**Timeout tuning tips:**
- Slow networks or sluggish remote browsers often just need a higher `timeouts.multiplier`
- For truly problematic selectors, override specific keys directly under `timeouts`
- Keep `retry_enabled` on so DOM lookups are retried with exponential backoff
For more details on timeout configuration and troubleshooting, see [Browser Troubleshooting](./BROWSER_TROUBLESHOOTING.md).
### download
Download configuration for the `download` command.
```yaml
download:
include_all_matching_shipping_options: false # if true, all shipping options matching the package size will be included
excluded_shipping_options: [] # list of shipping options to exclude, e.g. ['DHL_2', 'DHL_5']
folder_name_max_length: 100 # maximum length for folder names when downloading ads (default: 100)
rename_existing_folders: false # if true, rename existing folders without titles to include titles (default: false)
```
### publishing
Publishing configuration.
```yaml
publishing:
delete_old_ads: "AFTER_PUBLISH" # one of: AFTER_PUBLISH, BEFORE_PUBLISH, NEVER
delete_old_ads_by_title: true # only works if delete_old_ads is set to BEFORE_PUBLISH
```
### captcha
Captcha handling configuration. Enable automatic restart to avoid manual confirmation after captchas.
```yaml
captcha:
auto_restart: true # If true, the bot aborts when a Captcha appears and retries publishing later
# If false (default), the Captcha must be solved manually to continue
restart_delay: 1h 30m # Time to wait before retrying after a Captcha was encountered (default: 6h)
```
### browser
Browser configuration. These settings control how the bot launches and connects to Chromium-based browsers.
```yaml
browser:
# See: https://peter.sh/experiments/chromium-command-line-switches/
arguments:
# Example arguments
- --disable-dev-shm-usage
- --no-sandbox
# --headless
# --start-maximized
binary_location: # path to custom browser executable, if not specified will be looked up on PATH
extensions: [] # a list of .crx extension files to be loaded
use_private_window: true
user_data_dir: "" # see https://github.com/chromium/chromium/blob/main/docs/user_data_dir.md
profile_name: ""
```
**Common browser arguments:**
- `--disable-dev-shm-usage` - Avoids shared memory issues in Docker environments
- `--no-sandbox` - Required when running as root (not recommended)
- `--headless` - Run browser in headless mode (no GUI)
- `--start-maximized` - Start browser maximized
For detailed browser connection troubleshooting, including Chrome 136+ security requirements and remote debugging setup, see [Browser Troubleshooting](./BROWSER_TROUBLESHOOTING.md).
### update_check
Update check configuration to automatically check for newer versions on GitHub.
```yaml
update_check:
enabled: true # Enable/disable update checks
channel: latest # One of: latest, preview
interval: 7d # Check interval (e.g. 7d for 7 days)
```
**Interval format:**
- `s`: seconds, `m`: minutes, `h`: hours, `d`: days
- Examples: `7d` (7 days), `12h` (12 hours), `30d` (30 days)
- Validation: minimum 1 day, maximum 30 days
**Channels:**
- `latest`: Only final releases
- `preview`: Includes pre-releases
### login
Login credentials.
```yaml
login:
username: ""
password: ""
```
> **Security Note:** Never commit your credentials to version control. Keep your `config.yaml` secure and exclude it from git if it contains sensitive information.
### diagnostics
Diagnostics configuration for troubleshooting login detection issues.
```yaml
diagnostics:
login_detection_capture: false # Capture screenshot + HTML when login state is UNKNOWN
pause_on_login_detection_failure: false # Pause for manual inspection (interactive only)
output_dir: "" # Custom output directory (default: portable .temp/diagnostics, xdg cache/diagnostics)
```
**Login Detection Behavior:**
The bot uses a layered approach to detect login state, prioritizing stealth over reliability:
1. **DOM check (primary method - preferred for stealth)**: Checks for user profile elements
- Looks for `.mr-medium` element containing username
- Falls back to `#user-email` ID
- Uses `login_detection` timeout (default: 10.0 seconds)
- Minimizes bot-like behavior by avoiding JSON API requests
2. **Auth probe fallback (more reliable)**: Sends a GET request to `{root_url}/m-meine-anzeigen-verwalten.json?sort=DEFAULT`
- Returns `LOGGED_IN` if response is HTTP 200 with valid JSON containing `"ads"` key
- Returns `LOGGED_OUT` if response is HTTP 401/403 or HTML contains login markers
- Returns `UNKNOWN` on timeouts, assertion failures, or unexpected response bodies
- Only used when DOM check is inconclusive (UNKNOWN or timed out)
**Optional diagnostics:**
- Enable `login_detection_capture` to capture screenshots and HTML dumps when state is `UNKNOWN`
- Enable `pause_on_login_detection_failure` to pause the bot for manual inspection (interactive sessions only; requires `login_detection_capture=true`)
- Use custom `output_dir` to specify where artifacts are saved
**Output locations (default):**
- **Portable mode**: `./.temp/diagnostics/`
- **System-wide mode (XDG)**: `~/.cache/kleinanzeigen-bot/diagnostics/` (Linux) or `~/Library/Caches/kleinanzeigen-bot/diagnostics/` (macOS)
- **Custom**: Path resolved relative to your `config.yaml` if `output_dir` is specified
> **⚠️ PII Warning:** HTML dumps may contain your account email or other personally identifiable information. Review files in the diagnostics output directory before sharing them publicly.
## Installation Modes
On first run, the app may ask which installation mode to use.
1. **Portable mode (recommended for most users, especially on Windows):**
- Stores config, logs, downloads, and state in the current directory
- No admin permissions required
- Easy backup/migration; works from USB drives
2. **System-wide mode (advanced users / multi-user setups):**
- Stores files in OS-standard locations
- Cleaner directory structure; better separation from working directory
- Requires proper permissions for user data directories
**OS notes:**
- **Windows:** System-wide uses AppData (Roaming/Local); portable keeps everything beside the `.exe`.
- **Linux:** System-wide follows XDG Base Directory spec; portable stays in the current working directory.
- **macOS:** System-wide uses `~/Library/Application Support/kleinanzeigen-bot` (and related dirs); portable stays in the current directory.
## Getting Current Defaults
To see all current default values, run:
```bash
kleinanzeigen-bot create-config
```
This generates a config file with `exclude_none=True`, giving you all the non-None defaults.
For the complete machine-readable reference, see the [JSON schema](../schemas/config.schema.json).

33
docs/INDEX.md Normal file
View File

@@ -0,0 +1,33 @@
# Documentation Index
This directory contains detailed documentation for kleinanzeigen-bot users and contributors.
## User Documentation
- [Configuration](./CONFIGURATION.md) - Complete reference for `config.yaml`, including all configuration options, timeouts, browser settings, and update check configuration.
- [Ad Configuration](./AD_CONFIGURATION.md) - Complete reference for ad YAML files, including automatic price reduction, description prefix/suffix, and shipping options.
- [Browser Troubleshooting](./BROWSER_TROUBLESHOOTING.md) - Troubleshooting guide for browser connection issues, including Chrome 136+ security requirements, remote debugging setup, and common solutions.
## Contributor Documentation
Contributor documentation is located in the main repository:
- [CONTRIBUTING.md](../CONTRIBUTING.md) - Development setup, workflow, code quality standards, testing requirements, and contribution guidelines.
- [TESTING.md](./TESTING.md) - Detailed testing strategy, test types (unit/integration/smoke), and execution instructions for contributors.
## Getting Started
New users should start with the [README](../README.md), then refer to these documents for detailed configuration and troubleshooting information.
### Quick Start (3 steps)
1. Install and run the app from the [README](../README.md).
2. Generate `config.yaml` with `kleinanzeigen-bot create-config` and review defaults in [Configuration](./CONFIGURATION.md).
3. Verify your setup with `kleinanzeigen-bot verify`, then publish with `kleinanzeigen-bot publish`.
### Common Troubleshooting Tips
- Browser connection issues: confirm remote debugging settings and Chrome 136+ requirements in [Browser Troubleshooting](./BROWSER_TROUBLESHOOTING.md).

View File

@@ -39,14 +39,15 @@ This project uses a layered testing approach, with a focus on reliability and fa
- All smoke tests **must** be marked with `@pytest.mark.smoke`.
- Place smoke tests in `tests/smoke/` for discoverability.
- Example:
```python
import pytest
@pytest.mark.smoke
@pytest.mark.asyncio
async def test_bot_starts(smoke_bot):
...
```
```python
import pytest
@pytest.mark.smoke
@pytest.mark.asyncio
async def test_bot_starts(smoke_bot):
...
```
### Running Smoke, Unit, and Integration Tests

View File

@@ -1,95 +0,0 @@
# Update Check Feature
## Overview
The update check feature automatically checks for newer versions of the bot on GitHub. It supports two channels:
- `latest`: Only final releases
- `prerelease`: Includes pre-releases
## Configuration
```yaml
update_check:
enabled: true # Enable/disable update checks
channel: latest # One of: latest, prerelease
interval: 7d # Check interval (e.g. 7d for 7 days)
```
### Interval Format
The interval is specified as a number followed by a unit:
- `s`: seconds
- `m`: minutes
- `h`: hours
- `d`: days
Examples:
- `7d`: Check every 7 days
- `12h`: Check every 12 hours
- `30d`: Check every 30 days
Validation rules:
- Minimum interval: 1 day (`1d`)
- Maximum interval: 30 days (`30d`, roughly 4 weeks)
- Value must be positive
- Only supported units are allowed
## State File
The update check state is stored in `.temp/update_check_state.json`. The file format is:
```json
{
"version": 1,
"last_check": "2024-03-20T12:00:00+00:00"
}
```
### Fields
- `version`: Current state file format version (integer)
- `last_check`: ISO 8601 timestamp of the last check (UTC)
### Migration
The state file supports version migration:
- Version 0 to 1: Added version field
- Future versions will be migrated automatically
### Timezone Handling
All timestamps are stored in UTC:
- When loading:
- Timestamps without timezone are assumed to be UTC
- Timestamps with timezone are converted to UTC
- When saving:
- All timestamps are converted to UTC before saving
- Timezone information is preserved in ISO 8601 format
### Edge Cases
The following edge cases are handled:
- Missing state file: Creates new state file
- Corrupted state file: Creates new state file
- Invalid timestamp format: Logs warning, uses current time
- Permission errors: Logs warning, continues without saving
- Invalid interval format: Logs warning, performs check
- Interval too short/long: Logs warning, performs check
## Error Handling
The update check feature handles various error scenarios:
- Network errors: Logs error, continues without check
- GitHub API errors: Logs error, continues without check
- Version parsing errors: Logs error, continues without check
- State file errors: Logs error, creates new state file
- Permission errors: Logs error, continues without saving
## Logging
The feature logs various events:
- Check results (new version available, up to date, etc.)
- State file operations (load, save, migration)
- Error conditions (network, API, parsing, etc.)
- Interval validation warnings
- Timezone conversion information