mirror of
https://github.com/Second-Hand-Friends/kleinanzeigen-bot.git
synced 2026-03-12 02:31:45 +01:00
## ℹ️ Description Refactors and reorganizes documentation to improve navigation and keep the README concise. - Link to the related issue(s): Issue #N/A - Describe the motivation and context for this change. - The README had grown long and duplicated detailed config/ad references; this consolidates docs into focused guides and adds an index. ## 📋 Changes Summary - Add dedicated docs pages for configuration, ad configuration, update checks, and a docs index. - Slim README and CONTRIBUTING to reference dedicated guides and clean up formatting/markdownlint issues. - Refresh browser troubleshooting and update-check guidance; keep the update channel name aligned with schema/implementation. - Add markdownlint configuration for consistent docs formatting. ### ⚙️ Type of Change Select the type(s) of change(s) included in this pull request: - [ ] 🐞 Bug fix (non-breaking change which fixes an issue) - [x] ✨ New feature (adds new functionality without breaking existing usage) - [ ] 💥 Breaking change (changes that might break existing user setups, scripts, or configurations) ## ✅ Checklist Before requesting a review, confirm the following: - [x] I have reviewed my changes to ensure they meet the project's standards. - [x] I have tested my changes and ensured that all tests pass (`pdm run test`). - [x] I have formatted the code (`pdm run format`). - [x] I have verified that linting passes (`pdm run lint`). - [x] I have updated documentation where necessary. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Documentation** * Reorganized and enhanced contributing guidelines with improved structure and formatting * Streamlined README with better organization and updated installation instructions * Added comprehensive configuration reference documentation for configuration and ad settings * Improved browser troubleshooting guide with updated guidance, examples, and diagnostic information * Created new documentation index for easier navigation <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai -->
297 lines
11 KiB
Markdown
297 lines
11 KiB
Markdown
# Configuration Reference
|
|
|
|
Complete reference for `config.yaml`, the main configuration file for kleinanzeigen-bot.
|
|
|
|
## Quick Start
|
|
|
|
To generate a default configuration file with all current defaults:
|
|
|
|
```bash
|
|
kleinanzeigen-bot create-config
|
|
```
|
|
|
|
For full JSON schema with IDE autocompletion support, see:
|
|
|
|
- [schemas/config.schema.json](../schemas/config.schema.json)
|
|
|
|
To enable IDE autocompletion in `config.yaml`, add this at the top of the file:
|
|
|
|
```yaml
|
|
# yaml-language-server: $schema=schemas/config.schema.json
|
|
```
|
|
|
|
For ad files, use the ad schema instead:
|
|
|
|
```yaml
|
|
# yaml-language-server: $schema=schemas/ad.schema.json
|
|
```
|
|
|
|
## File Location
|
|
|
|
The bot looks for `config.yaml` in the current directory by default. You can specify a different location using the `--config` command line option:
|
|
|
|
```bash
|
|
kleinanzeigen-bot --config /path/to/config.yaml publish
|
|
```
|
|
|
|
Valid file extensions: `.json`, `.yaml`, `.yml`
|
|
|
|
## Configuration Structure
|
|
|
|
### ad_files
|
|
|
|
Glob (wildcard) patterns to select ad configuration files. If relative paths are specified, they are relative to this configuration file.
|
|
|
|
```yaml
|
|
ad_files:
|
|
- "./**/ad_*.{json,yml,yaml}"
|
|
```
|
|
|
|
### ad_defaults
|
|
|
|
Default values for ads that can be overridden in each ad configuration file.
|
|
|
|
```yaml
|
|
ad_defaults:
|
|
active: true
|
|
type: OFFER # one of: OFFER, WANTED
|
|
|
|
description_prefix: ""
|
|
description_suffix: ""
|
|
|
|
price_type: NEGOTIABLE # one of: FIXED, NEGOTIABLE, GIVE_AWAY, NOT_APPLICABLE
|
|
shipping_type: SHIPPING # one of: PICKUP, SHIPPING, NOT_APPLICABLE
|
|
# NOTE: shipping_costs and shipping_options must be configured per-ad, not as defaults
|
|
sell_directly: false # requires shipping_type SHIPPING to take effect
|
|
contact:
|
|
name: ""
|
|
street: ""
|
|
zipcode: ""
|
|
phone: "" # IMPORTANT: surround phone number with quotes to prevent removal of leading zeros
|
|
republication_interval: 7 # every X days ads should be re-published
|
|
```
|
|
|
|
> **Tip:** For current defaults of all timeout and diagnostic settings, run `kleinanzeigen-bot create-config` or see the [JSON schema](../schemas/config.schema.json).
|
|
|
|
### categories
|
|
|
|
Additional name to category ID mappings. See the default list at:
|
|
[https://github.com/Second-Hand-Friends/kleinanzeigen-bot/blob/main/src/kleinanzeigen_bot/resources/categories.yaml](https://github.com/Second-Hand-Friends/kleinanzeigen-bot/blob/main/src/kleinanzeigen_bot/resources/categories.yaml)
|
|
|
|
```yaml
|
|
categories:
|
|
Verschenken & Tauschen > Tauschen: 272/273
|
|
Verschenken & Tauschen > Verleihen: 272/274
|
|
Verschenken & Tauschen > Verschenken: 272/192
|
|
```
|
|
|
|
### timeouts
|
|
|
|
Timeout tuning for various browser operations. Adjust these if you experience slow page loads or recurring timeouts.
|
|
|
|
```yaml
|
|
timeouts:
|
|
multiplier: 1.0 # Scale all timeouts (e.g. 2.0 for slower networks)
|
|
default: 5.0 # Base timeout for web_find/web_click/etc.
|
|
page_load: 15.0 # Timeout for web_open page loads
|
|
captcha_detection: 2.0 # Timeout for captcha iframe detection
|
|
sms_verification: 4.0 # Timeout for SMS verification banners
|
|
email_verification: 4.0 # Timeout for email verification prompts
|
|
gdpr_prompt: 10.0 # Timeout when handling GDPR dialogs
|
|
login_detection: 10.0 # Timeout for DOM-based login detection (primary method)
|
|
publishing_result: 300.0 # Timeout for publishing status checks
|
|
publishing_confirmation: 20.0 # Timeout for publish confirmation redirect
|
|
image_upload: 30.0 # Timeout for image upload and server-side processing
|
|
pagination_initial: 10.0 # Timeout for first pagination lookup
|
|
pagination_follow_up: 5.0 # Timeout for subsequent pagination clicks
|
|
quick_dom: 2.0 # Generic short DOM timeout (shipping dialogs, etc.)
|
|
update_check: 10.0 # Timeout for GitHub update requests
|
|
chrome_remote_probe: 2.0 # Timeout for local remote-debugging probes
|
|
chrome_remote_debugging: 5.0 # Timeout for remote debugging API calls
|
|
chrome_binary_detection: 10.0 # Timeout for chrome --version subprocess
|
|
retry_enabled: true # Enables DOM retry/backoff when timeouts occur
|
|
retry_max_attempts: 2
|
|
retry_backoff_factor: 1.5
|
|
```
|
|
|
|
**Timeout tuning tips:**
|
|
|
|
- Slow networks or sluggish remote browsers often just need a higher `timeouts.multiplier`
|
|
- For truly problematic selectors, override specific keys directly under `timeouts`
|
|
- Keep `retry_enabled` on so DOM lookups are retried with exponential backoff
|
|
|
|
For more details on timeout configuration and troubleshooting, see [Browser Troubleshooting](./BROWSER_TROUBLESHOOTING.md).
|
|
|
|
### download
|
|
|
|
Download configuration for the `download` command.
|
|
|
|
```yaml
|
|
download:
|
|
include_all_matching_shipping_options: false # if true, all shipping options matching the package size will be included
|
|
excluded_shipping_options: [] # list of shipping options to exclude, e.g. ['DHL_2', 'DHL_5']
|
|
folder_name_max_length: 100 # maximum length for folder names when downloading ads (default: 100)
|
|
rename_existing_folders: false # if true, rename existing folders without titles to include titles (default: false)
|
|
```
|
|
|
|
### publishing
|
|
|
|
Publishing configuration.
|
|
|
|
```yaml
|
|
publishing:
|
|
delete_old_ads: "AFTER_PUBLISH" # one of: AFTER_PUBLISH, BEFORE_PUBLISH, NEVER
|
|
delete_old_ads_by_title: true # only works if delete_old_ads is set to BEFORE_PUBLISH
|
|
```
|
|
|
|
### captcha
|
|
|
|
Captcha handling configuration. Enable automatic restart to avoid manual confirmation after captchas.
|
|
|
|
```yaml
|
|
captcha:
|
|
auto_restart: true # If true, the bot aborts when a Captcha appears and retries publishing later
|
|
# If false (default), the Captcha must be solved manually to continue
|
|
restart_delay: 1h 30m # Time to wait before retrying after a Captcha was encountered (default: 6h)
|
|
```
|
|
|
|
### browser
|
|
|
|
Browser configuration. These settings control how the bot launches and connects to Chromium-based browsers.
|
|
|
|
```yaml
|
|
browser:
|
|
# See: https://peter.sh/experiments/chromium-command-line-switches/
|
|
arguments:
|
|
# Example arguments
|
|
- --disable-dev-shm-usage
|
|
- --no-sandbox
|
|
# --headless
|
|
# --start-maximized
|
|
binary_location: # path to custom browser executable, if not specified will be looked up on PATH
|
|
extensions: [] # a list of .crx extension files to be loaded
|
|
use_private_window: true
|
|
user_data_dir: "" # see https://github.com/chromium/chromium/blob/main/docs/user_data_dir.md
|
|
profile_name: ""
|
|
```
|
|
|
|
**Common browser arguments:**
|
|
|
|
- `--disable-dev-shm-usage` - Avoids shared memory issues in Docker environments
|
|
- `--no-sandbox` - Required when running as root (not recommended)
|
|
- `--headless` - Run browser in headless mode (no GUI)
|
|
- `--start-maximized` - Start browser maximized
|
|
|
|
For detailed browser connection troubleshooting, including Chrome 136+ security requirements and remote debugging setup, see [Browser Troubleshooting](./BROWSER_TROUBLESHOOTING.md).
|
|
|
|
### update_check
|
|
|
|
Update check configuration to automatically check for newer versions on GitHub.
|
|
|
|
```yaml
|
|
update_check:
|
|
enabled: true # Enable/disable update checks
|
|
channel: latest # One of: latest, preview
|
|
interval: 7d # Check interval (e.g. 7d for 7 days)
|
|
```
|
|
|
|
**Interval format:**
|
|
|
|
- `s`: seconds, `m`: minutes, `h`: hours, `d`: days
|
|
- Examples: `7d` (7 days), `12h` (12 hours), `30d` (30 days)
|
|
- Validation: minimum 1 day, maximum 30 days
|
|
|
|
**Channels:**
|
|
|
|
- `latest`: Only final releases
|
|
- `preview`: Includes pre-releases
|
|
|
|
### login
|
|
|
|
Login credentials.
|
|
|
|
```yaml
|
|
login:
|
|
username: ""
|
|
password: ""
|
|
```
|
|
|
|
> **Security Note:** Never commit your credentials to version control. Keep your `config.yaml` secure and exclude it from git if it contains sensitive information.
|
|
|
|
### diagnostics
|
|
|
|
Diagnostics configuration for troubleshooting login detection issues.
|
|
|
|
```yaml
|
|
diagnostics:
|
|
login_detection_capture: false # Capture screenshot + HTML when login state is UNKNOWN
|
|
pause_on_login_detection_failure: false # Pause for manual inspection (interactive only)
|
|
output_dir: "" # Custom output directory (default: portable .temp/diagnostics, xdg cache/diagnostics)
|
|
```
|
|
|
|
**Login Detection Behavior:**
|
|
|
|
The bot uses a layered approach to detect login state, prioritizing stealth over reliability:
|
|
|
|
1. **DOM check (primary method - preferred for stealth)**: Checks for user profile elements
|
|
|
|
- Looks for `.mr-medium` element containing username
|
|
- Falls back to `#user-email` ID
|
|
- Uses `login_detection` timeout (default: 10.0 seconds)
|
|
- Minimizes bot-like behavior by avoiding JSON API requests
|
|
|
|
2. **Auth probe fallback (more reliable)**: Sends a GET request to `{root_url}/m-meine-anzeigen-verwalten.json?sort=DEFAULT`
|
|
|
|
- Returns `LOGGED_IN` if response is HTTP 200 with valid JSON containing `"ads"` key
|
|
- Returns `LOGGED_OUT` if response is HTTP 401/403 or HTML contains login markers
|
|
- Returns `UNKNOWN` on timeouts, assertion failures, or unexpected response bodies
|
|
- Only used when DOM check is inconclusive (UNKNOWN or timed out)
|
|
|
|
**Optional diagnostics:**
|
|
|
|
- Enable `login_detection_capture` to capture screenshots and HTML dumps when state is `UNKNOWN`
|
|
- Enable `pause_on_login_detection_failure` to pause the bot for manual inspection (interactive sessions only; requires `login_detection_capture=true`)
|
|
- Use custom `output_dir` to specify where artifacts are saved
|
|
|
|
**Output locations (default):**
|
|
|
|
- **Portable mode**: `./.temp/diagnostics/`
|
|
- **System-wide mode (XDG)**: `~/.cache/kleinanzeigen-bot/diagnostics/` (Linux) or `~/Library/Caches/kleinanzeigen-bot/diagnostics/` (macOS)
|
|
- **Custom**: Path resolved relative to your `config.yaml` if `output_dir` is specified
|
|
|
|
> **⚠️ PII Warning:** HTML dumps may contain your account email or other personally identifiable information. Review files in the diagnostics output directory before sharing them publicly.
|
|
|
|
## Installation Modes
|
|
|
|
On first run, the app may ask which installation mode to use.
|
|
|
|
1. **Portable mode (recommended for most users, especially on Windows):**
|
|
|
|
- Stores config, logs, downloads, and state in the current directory
|
|
- No admin permissions required
|
|
- Easy backup/migration; works from USB drives
|
|
|
|
2. **System-wide mode (advanced users / multi-user setups):**
|
|
|
|
- Stores files in OS-standard locations
|
|
- Cleaner directory structure; better separation from working directory
|
|
- Requires proper permissions for user data directories
|
|
|
|
**OS notes:**
|
|
|
|
- **Windows:** System-wide uses AppData (Roaming/Local); portable keeps everything beside the `.exe`.
|
|
- **Linux:** System-wide follows XDG Base Directory spec; portable stays in the current working directory.
|
|
- **macOS:** System-wide uses `~/Library/Application Support/kleinanzeigen-bot` (and related dirs); portable stays in the current directory.
|
|
|
|
## Getting Current Defaults
|
|
|
|
To see all current default values, run:
|
|
|
|
```bash
|
|
kleinanzeigen-bot create-config
|
|
```
|
|
|
|
This generates a config file with `exclude_none=True`, giving you all the non-None defaults.
|
|
|
|
For the complete machine-readable reference, see the [JSON schema](../schemas/config.schema.json).
|