docs: refactor guides for clearer navigation (#795)

## ℹ️ Description
Refactors and reorganizes documentation to improve navigation and keep
the README concise.

- Link to the related issue(s): Issue #N/A
- Describe the motivation and context for this change.
- The README had grown long and duplicated detailed config/ad
references; this consolidates docs into focused guides and adds an
index.

## 📋 Changes Summary
- Add dedicated docs pages for configuration, ad configuration, update
checks, and a docs index.
- Slim README and CONTRIBUTING to reference dedicated guides and clean
up formatting/markdownlint issues.
- Refresh browser troubleshooting and update-check guidance; keep the
update channel name aligned with schema/implementation.
- Add markdownlint configuration for consistent docs formatting.

### ⚙️ Type of Change
Select the type(s) of change(s) included in this pull request:
- [ ] 🐞 Bug fix (non-breaking change which fixes an issue)
- [x] New feature (adds new functionality without breaking existing
usage)
- [ ] 💥 Breaking change (changes that might break existing user setups,
scripts, or configurations)


## Checklist
Before requesting a review, confirm the following:
- [x] I have reviewed my changes to ensure they meet the project's
standards.
- [x] I have tested my changes and ensured that all tests pass (`pdm run
test`).
- [x] I have formatted the code (`pdm run format`).
- [x] I have verified that linting passes (`pdm run lint`).
- [x] I have updated documentation where necessary.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Documentation**
  * Reorganized and enhanced contributing guidelines with improved
    structure and formatting
  * Streamlined README with better organization and updated installation
    instructions
  * Added comprehensive configuration reference documentation for
    configuration and ad settings
  * Improved browser troubleshooting guide with updated guidance,
    examples, and diagnostic information
  * Created new documentation index for easier navigation

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Jens
2026-01-30 11:06:36 +01:00
committed by GitHub
parent 3dc24e1df7
commit a4946ba104
9 changed files with 954 additions and 561 deletions

11
.markdownlint-cli2.jsonc Normal file

@@ -0,0 +1,11 @@
{
"$schema": "https://raw.githubusercontent.com/DavidAnson/markdownlint-cli2/main/schema/markdownlint-cli2-config-schema.json",
"config": {
"MD013": false,
"MD033": false
},
"ignores": [
"CODE_OF_CONDUCT.md",
"data/"
]
}


@@ -1,4 +1,12 @@
# Table of Contents
# Contributing
Thanks for your interest in contributing to this project! Whether it's a bug report, new feature, correction, or additional documentation, we greatly value feedback and contributions from our community.
We want to make contributing as easy and transparent as possible. Contributions via [pull requests](#pull-request-requirements) are much appreciated.
Please read through this document before submitting any contributions to ensure your contribution goes to the correct code repository and we have all the necessary information to effectively respond to your request.
## Table of Contents
- [Development Setup](#development-setup)
- [Development Notes](#development-notes)
@@ -13,29 +21,23 @@
- [Licensing](#licensing)
- [Internationalization (i18n) and Translations](#internationalization-i18n-and-translations)
# Contributing
Thanks for your interest in contributing to this project! Whether it's a bug report, new feature, correction, or additional documentation, we greatly value feedback and contributions from our community.
We want to make contributing as easy and transparent as possible. Contributions via [pull requests](#pull-request-requirements) are much appreciated.
Please read through this document before submitting any contributions to ensure your contribution goes to the correct code repository and we have all the necessary information to effectively respond to your request.
## Development Setup
### Prerequisites
- Python 3.10 or higher
- PDM for dependency management
- Git
### Local Setup
1. Fork and clone the repository
2. Install dependencies: `pdm install`
3. Run tests to verify setup: `pdm run test:cov`
1. Install dependencies: `pdm install`
1. Run tests to verify setup: `pdm run test:cov`
## Development Notes
This section provides quick reference commands for common development tasks. See Testing Requirements below for more details on running and organizing tests.
This section provides quick reference commands for common development tasks. See 'Testing Requirements' below for more details on running and organizing tests.
- Format source code: `pdm run format`
- Run tests: `pdm run test` (see 'Testing Requirements' below for more details)
@@ -44,47 +46,58 @@ This section provides quick reference commands for common development tasks. See
- Derive JSON schema files from Pydantic data model: `pdm run generate-schemas`
- Create platform-specific executable: `pdm run compile`
- Application bootstrap works like this:
```python
```python
pdm run app
|-> executes 'python -m kleinanzeigen_bot'
|-> executes 'kleinanzeigen_bot/__main__.py'
|-> executes main() function of 'kleinanzeigen_bot/__init__.py'
|-> executes KleinanzeigenBot().run()
|-> executes 'python -m kleinanzeigen_bot'
|-> executes 'kleinanzeigen_bot/__main__.py'
|-> executes main() function of 'kleinanzeigen_bot/__init__.py'
|-> executes KleinanzeigenBot().run()
```
## Development Workflow
### Before Submitting
1. **Format your code**: Ensure your code is auto-formatted
```bash
pdm run format
```
2. **Lint your code**: Check for linting errors and warnings
1. **Lint your code**: Check for linting errors and warnings
```bash
pdm run lint
```
3. **Run tests**: Ensure all tests pass locally
1. **Run tests**: Ensure all tests pass locally
```bash
pdm run test
```
4. **Check code quality**: Verify your code follows project standards
- Type hints are complete
- Docstrings are present
- SPDX headers are included
- Imports are properly organized
5. **Test your changes**: Add appropriate tests for new functionality
- Add smoke tests for critical paths
- Add unit tests for new components
- Add integration tests for external dependencies
1. **Check code quality**: Verify your code follows project standards
- Type hints are complete
- Docstrings are present
- SPDX headers are included
- Imports are properly organized
1. **Test your changes**: Add appropriate tests for new functionality
- Add smoke tests for critical paths
- Add unit tests for new components
- Add integration tests for external dependencies
### Commit Messages
Use clear, descriptive commit messages that explain:
- What was changed
- Why it was changed
- Any breaking changes or important notes
Example:
```
```shell
feat: add smoke test for bot startup
- Add test_bot_starts_without_crashing to verify core workflow
@@ -97,11 +110,13 @@ feat: add smoke test for bot startup
This project uses a comprehensive testing strategy with three test types:
### Test Types
- **Unit tests** (`tests/unit/`): Isolated component tests with mocks. Run first.
- **Integration tests** (`tests/integration/`): Tests with real external dependencies. Run after unit tests.
- **Smoke tests** (`tests/smoke/`): Minimal, post-deployment health checks that verify the most essential workflows (e.g., app starts, config loads, login page reachable). Run after integration tests. Smoke tests are not end-to-end (E2E) tests and should not cover full user workflows.
### Running Tests
```bash
# Run all tests in order (unit → integration → smoke)
pdm run test:cov
@@ -118,31 +133,37 @@ pdm run smoke:cov # Smoke tests with coverage
```
### Adding New Tests
1. **Determine test type** based on what you're testing:
- **Smoke tests**: Minimal, critical health checks (not full user workflows)
- **Unit tests**: Individual components, isolated functionality
- **Integration tests**: External dependencies, real network calls
2. **Place in correct directory**:
1. **Place in correct directory**:
- `tests/smoke/` for smoke tests
- `tests/unit/` for unit tests
- `tests/integration/` for integration tests
3. **Add proper markers**:
1. **Add proper markers**:
```python
@pytest.mark.smoke # For smoke tests
@pytest.mark.itest # For integration tests
@pytest.mark.asyncio # For async tests
```
4. **Use existing fixtures** when possible (see `tests/conftest.py`)
1. **Use existing fixtures** when possible (see `tests/conftest.py`)
For detailed testing guidelines, see [docs/TESTING.md](docs/TESTING.md).
## Code Quality Standards
### File Headers
All Python files must start with SPDX license headers:
```python
# SPDX-FileCopyrightText: © <your name> and contributors
# SPDX-License-Identifier: AGPL-3.0-or-later
@@ -150,11 +171,13 @@ All Python files must start with SPDX license headers:
```
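A minimal sketch of how the header requirement above could be checked automatically. The helper `has_spdx_header` is hypothetical and is not part of the project's lint tooling; it only looks for the two SPDX tags shown in the snippet above and assumes they appear as `#` comments within the first few lines of a file.

```python
def has_spdx_header(source: str, max_lines: int = 5) -> bool:
    """Return True if both SPDX tags appear in the leading comment lines.

    Hypothetical helper for illustration only.
    """
    head = source.splitlines()[:max_lines]
    required = ("SPDX-FileCopyrightText:", "SPDX-License-Identifier:")
    # Every required tag must occur in at least one leading '#' comment line.
    return all(
        any(line.startswith("#") and tag in line for line in head)
        for tag in required
    )


good = (
    "# SPDX-FileCopyrightText: © Jane Doe and contributors\n"
    "# SPDX-License-Identifier: AGPL-3.0-or-later\n"
    "import os\n"
)
print(has_spdx_header(good))
print(has_spdx_header("import os\n"))
```

A check like this could run over changed files before a commit, but whether the project enforces headers this way is not stated here.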
### Import Organization
- Use absolute imports for project modules: `from kleinanzeigen_bot import KleinanzeigenBot`
- Use relative imports for test utilities: `from tests.conftest import SmokeKleinanzeigenBot`
- Group imports: standard library, third-party, local (with blank lines between groups)
### Type Hints
- Always use type hints for function parameters and return values
- Use `Any` from `typing` for complex types
- Use `Final` for constants
@@ -163,39 +186,46 @@ All Python files must start with SPDX license headers:
### Documentation
#### Docstrings
- Use docstrings for **complex functions and classes that need explanation**
- Include examples in docstrings for complex functions (see `utils/misc.py` for examples)
#### Comments
- **Use comments to explain your code logic and reasoning**
- Comment on complex algorithms, business logic, and non-obvious decisions
- Explain "why" not just "what" - the reasoning behind implementation choices
- Use comments for edge cases, workarounds, and platform-specific code
#### Module Documentation
- Add module docstrings for packages and complex modules
- Document the purpose and contents of each module
#### Model Documentation
- Use `Field(description="...")` for Pydantic model fields to document their purpose
- Include examples in field descriptions for complex configurations
- Document validation rules and constraints
#### Logging
- Use structured logging with `loggers.get_logger()`
- Include context in log messages to help with debugging
- Use appropriate log levels (DEBUG, INFO, WARNING, ERROR)
- Log important state changes and decision points
#### Timeout configuration
- The default timeout (`timeouts.default`) already wraps all standard DOM helpers (`web_find`, `web_click`, etc.) via `WebScrapingMixin._timeout/_effective_timeout`. Use it unless a workflow clearly needs a different SLA.
- Reserve `timeouts.quick_dom` for transient overlays (shipping dialogs, payment prompts, toast banners) that should render almost instantly; call `self._timeout("quick_dom")` in those spots to keep the UI responsive.
- For single selectors that occasionally need more headroom, pass an inline override instead of creating a new config key, e.g. `custom = self._timeout(override = 12.5); await self.web_find(..., timeout = custom)`.
- Use `_timeout()` when you just need the raw configured value (with optional override); use `_effective_timeout()` when you rely on the global multiplier and retry backoff for a given attempt (e.g. inside `_run_with_timeout_retries`).
- Add a new timeout key only when a recurring workflow has its own timing profile (pagination, captcha detection, publishing confirmations, Chrome probes, etc.). Whenever you add one, extend `TimeoutConfig`, document it in the sample `timeouts:` block in `README.md`, and explain it in `docs/BROWSER_TROUBLESHOOTING.md`.
- Add a new timeout key only when a recurring workflow has its own timing profile (pagination, captcha detection, publishing confirmations, Chrome probes, etc.). Whenever you add one, extend `TimeoutConfig`, document it in the sample `timeouts:` block in `docs/CONFIGURATION.md`, and explain it in `docs/BROWSER_TROUBLESHOOTING.md`.
- Encourage users to raise `timeouts.multiplier` when everything is slow, and override existing keys in `config.yaml` before introducing new ones. This keeps the configuration surface minimal.
#### Examples
```python
def parse_duration(text: str) -> timedelta:
"""
@@ -225,7 +255,9 @@ def parse_duration(text: str) -> timedelta:
# ... handle other units
return timedelta(**kwargs)
```
### Error Handling
- Use specific exception types when possible
- Include meaningful error messages
- Use `pytest.fail()` with descriptive messages in tests
@@ -236,7 +268,9 @@ def parse_duration(text: str) -> timedelta:
We use GitHub issues to track bugs and feature requests. Please ensure your description is clear and has sufficient instructions to be able to reproduce the issue.
### Bug Reports
When reporting a bug, please ensure you:
- Confirm the issue is reproducible on the latest release
- Clearly describe the expected and actual behavior
- Provide detailed steps to reproduce the issue
@@ -247,7 +281,9 @@ When reporting a bug, please ensure you:
This helps maintainers quickly triage and address issues.
### Feature Requests
Include:
- Clear description of the desired feature
- Use case or problem it solves
- Any implementation ideas or considerations
@@ -257,22 +293,23 @@ Include:
Before submitting a pull request, please ensure you:
1. **Work from the latest source on the main branch**
2. **Create a feature branch** for your changes: `git checkout -b feature/your-feature-name`
3. **Format your code**: `pdm run format`
4. **Lint your code**: `pdm run lint`
5. **Run all tests**: `pdm run test`
6. **Check code quality**: Type hints, docstrings, SPDX headers, import organization
7. **Add appropriate tests** for new functionality (smoke/unit/integration as needed)
8. **Write clear, descriptive commit messages**
9. **Provide a concise summary and motivation for the change in the PR**
10. **List all key changes and dependencies**
11. **Select the correct type(s) of change** (bug fix, feature, breaking change)
12. **Complete the checklist in the PR template**
13. **Confirm your contribution can be used under the project license**
1. **Create a feature branch** for your changes: `git checkout -b feature/your-feature-name`
1. **Format your code**: `pdm run format`
1. **Lint your code**: `pdm run lint`
1. **Run all tests**: `pdm run test`
1. **Check code quality**: Type hints, docstrings, SPDX headers, import organization
1. **Add appropriate tests** for new functionality (smoke/unit/integration as needed)
1. **Write clear, descriptive commit messages**
1. **Provide a concise summary and motivation for the change in the PR**
1. **List all key changes and dependencies**
1. **Select the correct type(s) of change** (bug fix, feature, breaking change)
1. **Complete the checklist in the PR template**
1. **Confirm your contribution can be used under the project license**
See the [Pull Request template](.github/PULL_REQUEST_TEMPLATE.md) for the full checklist and required fields.
To submit a pull request:
- Fork our repository
- Push your feature branch to your fork
- Open a pull request on GitHub, answering any default questions in the interface

440
README.md

@@ -4,6 +4,7 @@
[![License](https://img.shields.io/github/license/Second-Hand-Friends/kleinanzeigen-bot.svg?color=blue)](LICENSE.txt)
[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-v2.1%20adopted-ff69b4.svg)](CODE_OF_CONDUCT.md)
[![codecov](https://codecov.io/github/Second-Hand-Friends/kleinanzeigen-bot/graph/badge.svg?token=SKLDTVWHVK)](https://codecov.io/github/Second-Hand-Friends/kleinanzeigen-bot)
<!--[![Maintainability](https://qlty.sh/badges/69ff94b8-90e1-4096-91ed-3bcecf0b0597/maintainability.svg)](https://qlty.sh/gh/Second-Hand-Friends/projects/kleinanzeigen-bot)-->
**Feedback and high-quality pull requests are highly welcome!**
@@ -20,13 +21,12 @@
1. [Related Open-Source Projects](#related)
1. [License](#license)
For details on the new smoke test strategy and contributor guidance, see [TESTING.md](./docs/TESTING.md).
## <a name="about"></a>About
**kleinanzeigen-bot** is a command-line application to **publish, update, delete, and republish listings** on kleinanzeigen.de.
### Key Features
- **Automated Publishing**: Publish new listings from YAML/JSON configuration files
- **Smart Republishing**: Automatically republish listings at configurable intervals to keep them at the top of search results
- **Bulk Management**: Update or delete multiple listings at once
@@ -49,52 +49,56 @@ Es liegt in Ihrer Verantwortung, die rechtliche Zulässigkeit der Nutzung dieses
Die Entwickler übernehmen keinerlei Haftung für mögliche Schäden oder rechtliche Konsequenzen.
Die Nutzung erfolgt auf eigenes Risiko. Jede rechtswidrige Verwendung ist untersagt.
## <a name="installation"></a>Installation
### Installation using pre-compiled exe
1. The following components need to be installed:
1. [Chromium](https://www.chromium.org/getting-involved/download-chromium), [Google Chrome](https://www.google.com/chrome/),
or Chromium based [Microsoft Edge](https://www.microsoft.com/edge) browser
or Chromium-based [Microsoft Edge](https://www.microsoft.com/edge) browser
1. Open a command/terminal window
1. Download and run the app by entering the following commands:
1. On Windows:
```batch
curl -L https://github.com/Second-Hand-Friends/kleinanzeigen-bot/releases/download/latest/kleinanzeigen-bot-windows-amd64.exe -o kleinanzeigen-bot.exe
kleinanzeigen-bot --help
```
```batch
curl -L https://github.com/Second-Hand-Friends/kleinanzeigen-bot/releases/download/latest/kleinanzeigen-bot-windows-amd64.exe -o kleinanzeigen-bot.exe
kleinanzeigen-bot --help
```
1. On Linux:
```shell
curl -L https://github.com/Second-Hand-Friends/kleinanzeigen-bot/releases/download/latest/kleinanzeigen-bot-linux-amd64 -o kleinanzeigen-bot
chmod 755 kleinanzeigen-bot
```shell
curl -L https://github.com/Second-Hand-Friends/kleinanzeigen-bot/releases/download/latest/kleinanzeigen-bot-linux-amd64 -o kleinanzeigen-bot
./kleinanzeigen-bot --help
```
chmod 755 kleinanzeigen-bot
./kleinanzeigen-bot --help
```
1. On macOS:
```shell
curl -L https://github.com/Second-Hand-Friends/kleinanzeigen-bot/releases/download/latest/kleinanzeigen-bot-darwin-amd64 -o kleinanzeigen-bot
chmod 755 kleinanzeigen-bot
```shell
curl -L https://github.com/Second-Hand-Friends/kleinanzeigen-bot/releases/download/latest/kleinanzeigen-bot-darwin-amd64 -o kleinanzeigen-bot
./kleinanzeigen-bot --help
```
chmod 755 kleinanzeigen-bot
./kleinanzeigen-bot --help
```
### Installation using Docker
1. The following components need to be installed:
1. [Docker](https://www.docker.com/)
1. [Bash](https://www.gnu.org/software/bash/) (on Windows e.g. via [Cygwin](https://www.cygwin.com/), [MSys2](https://www.msys2.org/) or git)
1. [X11 - X Window System](https://en.wikipedia.org/wiki/X_Window_System) display server (on Windows e.g. https://github.com/P-St/Portable-X-Server/releases/latest)
1. [X11 - X Window System](https://en.wikipedia.org/wiki/X_Window_System) display server (on Windows e.g. [Portable-X-Server](https://github.com/P-St/Portable-X-Server/releases/latest))
**Running the docker image:**
1. Ensure the X11 Server is running
1. Run the docker image:
@@ -116,42 +120,53 @@ Die Nutzung erfolgt auf eigenes Risiko. Jede rechtswidrige Verwendung ist unters
### Installation from source
1. The following components need to be installed:
1. [Chromium](https://www.chromium.org/getting-involved/download-chromium), [Google Chrome](https://www.google.com/chrome/),
or Chromium based [Microsoft Edge](https://www.microsoft.com/edge) browser
or Chromium-based [Microsoft Edge](https://www.microsoft.com/edge) browser
1. [Python](https://www.python.org/) **3.10** or newer
1. [pip](https://pypi.org/project/pip/)
1. [git client](https://git-scm.com/downloads)
1. Open a command/terminal window
1. Clone the repo using
```
```bash
git clone https://github.com/Second-Hand-Friends/kleinanzeigen-bot/
```
1. Change into the directory:
```
```bash
cd kleinanzeigen-bot
```
1. Install the Python dependencies using:
```bash
pip install pdm
pdm install
```
1. Run the app:
```
```bash
pdm run app --help
```
### Installation from source using Docker
1. The following components need to be installed:
1. [Docker](https://www.docker.com/)
1. [git client](https://git-scm.com/downloads)
1. [Bash](https://www.gnu.org/software/bash/) (on Windows e.g. via [Cygwin](https://www.cygwin.com/), [MSys2](https://www.msys2.org/) or git)
1. [X11 - X Window System](https://en.wikipedia.org/wiki/X_Window_System) display server (on Windows e.g. https://github.com/P-St/Portable-X-Server/releases/latest)
1. [X11 - X Window System](https://en.wikipedia.org/wiki/X_Window_System) display server (on Windows e.g. [Portable-X-Server](https://github.com/P-St/Portable-X-Server/releases/latest))
1. Clone the repo using
```
```bash
git clone https://github.com/Second-Hand-Friends/kleinanzeigen-bot/
```
@@ -161,7 +176,7 @@ Die Nutzung erfolgt auf eigenes Risiko. Jede rechtswidrige Verwendung ist unters
1. Ensure the image is built:
```
```text
$ docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
second-hand-friends/kleinanzeigen-bot latest c31fd256eeea 1 minute ago 590MB
@@ -169,6 +184,7 @@ Die Nutzung erfolgt auf eigenes Risiko. Jede rechtswidrige Verwendung ist unters
```
**Running the docker image:**
1. Ensure the X11 Server is running
1. Run the docker image:
@@ -187,10 +203,9 @@ Die Nutzung erfolgt auf eigenes Risiko. Jede rechtswidrige Verwendung ist unters
--help
```
## <a name="usage"></a>Usage
```
```console
Usage: kleinanzeigen-bot COMMAND [OPTIONS]
Commands:
@@ -250,19 +265,24 @@ All configuration files can be in YAML or JSON format.
### Installation modes (portable vs. system-wide)
On first run, the app may ask which installation mode to use. In non-interactive environments (CI/headless), it defaults to portable mode and will not prompt; `--config` and `--logfile` override only their specific paths, and do not change other mode-dependent paths or the chosen installation mode behavior.
On first run, the app may ask which installation mode to use. In non-interactive environments (CI/headless), it defaults to portable mode and will not prompt.
The `--config` and `--logfile` flags override only their specific paths. They do not change the chosen installation mode or other mode-dependent paths (downloads, state files, etc.).
1. **Portable mode (recommended for most users, especially on Windows):**
- Stores config, logs, downloads, and state in the current directory
- No admin permissions required
- Easy backup/migration; works from USB drives
2. **System-wide mode (advanced users / multi-user setups):**
1. **System-wide mode (advanced users / multi-user setups):**
- Stores files in OS-standard locations
- Cleaner directory structure; better separation from working directory
- Requires proper permissions for user data directories
**OS notes (brief):**
- **Windows:** System-wide uses AppData (Roaming/Local); portable keeps everything beside the `.exe`.
- **Linux:** System-wide follows XDG Base Directory spec; portable stays in the current working directory.
- **macOS:** System-wide uses `~/Library/Application Support/kleinanzeigen-bot` (and related dirs); portable stays in the current directory.
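As a rough sketch, the OS notes above could translate into a resolver like the one below. The function `config_dir` and its parameters are hypothetical. The macOS path is taken from the note above; the Windows subfolder under `%APPDATA%` and the use of `XDG_CONFIG_HOME` on Linux are assumptions, not confirmed project behavior.

```python
from pathlib import PurePosixPath, PureWindowsPath

APP_NAME = "kleinanzeigen-bot"


def config_dir(mode: str, system: str, home: str, cwd: str, env: dict[str, str]) -> str:
    """Hypothetical resolver for the directory that holds config files."""
    if mode == "portable":
        # Portable mode keeps everything in the current directory.
        return cwd
    if system == "Windows":
        # Assumed layout: a subfolder of the roaming AppData directory.
        return str(PureWindowsPath(env["APPDATA"]) / APP_NAME)
    if system == "Darwin":
        return str(PurePosixPath(home) / "Library" / "Application Support" / APP_NAME)
    # Linux: XDG Base Directory spec, falling back to ~/.config.
    base = env.get("XDG_CONFIG_HOME") or str(PurePosixPath(home) / ".config")
    return str(PurePosixPath(base) / APP_NAME)


print(config_dir("portable", "Linux", "/home/u", "/work", {}))
print(config_dir("system", "Linux", "/home/u", "/work", {}))
print(config_dir("system", "Darwin", "/Users/u", "/work", {}))
```

Passing `home`, `cwd`, and `env` explicitly keeps the sketch testable without touching the real environment.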
@@ -274,157 +294,13 @@ When executing the app it by default looks for a `config.yaml` file in the curre
The configuration file to be used can also be specified using the `--config <PATH>` command line parameter. It must point to a YAML or JSON file.
Valid file extensions are `.json`, `.yaml`, and `.yml`.
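A minimal sketch of the extension rule, assuming the check is case-insensitive (the text does not say). `is_valid_config_path` is a hypothetical helper, not a function from the project.

```python
from pathlib import Path

# Extensions listed in the text above.
VALID_CONFIG_SUFFIXES = {".json", ".yaml", ".yml"}


def is_valid_config_path(path: str) -> bool:
    """Return True if the path ends in one of the supported extensions."""
    # Lower-casing the suffix is an assumption about case handling.
    return Path(path).suffix.lower() in VALID_CONFIG_SUFFIXES


print(is_valid_config_path("config.yaml"))
print(is_valid_config_path("config.toml"))
```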
The following parameters can be configured:
The configuration file supports many options including login credentials, ad file patterns, browser settings, timeouts, and update check configuration. To generate a default configuration file with all current defaults, run:
```yaml
# yaml-language-server: $schema=https://raw.githubusercontent.com/Second-Hand-Friends/kleinanzeigen-bot/refs/heads/main/schemas/config.schema.json
# glob (wildcard) patterns to select ad configuration files
# if relative paths are specified, then they are relative to this configuration file
ad_files:
- "./**/ad_*.{json,yml,yaml}"
# default values for ads, can be overwritten in each ad configuration file
ad_defaults:
active: true
type: OFFER # one of: OFFER, WANTED
description_prefix: ""
description_suffix: ""
price_type: NEGOTIABLE # one of: FIXED, NEGOTIABLE, GIVE_AWAY, NOT_APPLICABLE
shipping_type: SHIPPING # one of: PICKUP, SHIPPING, NOT_APPLICABLE
# NOTE: shipping_costs and shipping_options must be configured per-ad, not as defaults
sell_directly: false # requires shipping_type SHIPPING to take effect
contact:
name: ""
street: ""
zipcode:
phone: "" # IMPORTANT: surround phone number with quotes to prevent removal of leading zeros
republication_interval: 7 # every X days ads should be re-published
# additional name to category ID mappings, see default list at
# https://github.com/Second-Hand-Friends/kleinanzeigen-bot/blob/main/src/kleinanzeigen_bot/resources/categories.yaml
categories:
Verschenken & Tauschen > Tauschen: 272/273
Verschenken & Tauschen > Verleihen: 272/274
Verschenken & Tauschen > Verschenken: 272/192
# timeout tuning (optional)
timeouts:
multiplier: 1.0 # Scale all timeouts (e.g. 2.0 for slower networks)
default: 5.0 # Base timeout for web_find/web_click/etc.
page_load: 15.0 # Timeout for web_open page loads
captcha_detection: 2.0 # Timeout for captcha iframe detection
sms_verification: 4.0 # Timeout for SMS verification banners
email_verification: 4.0 # Timeout for email verification banners
gdpr_prompt: 10.0 # Timeout when handling GDPR dialogs
login_detection: 10.0 # Timeout for DOM-based login detection fallback (auth probe is tried first)
publishing_result: 300.0 # Timeout for publishing status checks
publishing_confirmation: 20.0 # Timeout for publish confirmation redirect
image_upload: 30.0 # Timeout for image upload and server-side processing
pagination_initial: 10.0 # Timeout for first pagination lookup
pagination_follow_up: 5.0 # Timeout for subsequent pagination clicks
quick_dom: 2.0 # Generic short DOM timeout (shipping dialogs, etc.)
update_check: 10.0 # Timeout for GitHub update requests
chrome_remote_probe: 2.0 # Timeout for local remote-debugging probes
chrome_remote_debugging: 5.0 # Timeout for remote debugging API calls
chrome_binary_detection: 10.0 # Timeout for chrome --version subprocess
retry_enabled: true # Enables DOM retry/backoff when timeouts occur
retry_max_attempts: 2
retry_backoff_factor: 1.5
# download configuration
download:
include_all_matching_shipping_options: false # if true, all shipping options matching the package size will be included
excluded_shipping_options: [] # list of shipping options to exclude, e.g. ['DHL_2', 'DHL_5']
folder_name_max_length: 100 # maximum length for folder names when downloading ads (default: 100)
rename_existing_folders: false # if true, rename existing folders without titles to include titles (default: false)
# publishing configuration
publishing:
delete_old_ads: "AFTER_PUBLISH" # one of: AFTER_PUBLISH, BEFORE_PUBLISH, NEVER
delete_old_ads_by_title: true # only works if delete_old_ads is set to BEFORE_PUBLISH
# captcha-Handling (optional)
# To ensure that the bot does not require manual confirmation after a captcha, but instead automatically pauses for a defined period and then restarts, you can enable the captcha section:
captcha:
auto_restart: true # If true, the bot aborts when a Captcha appears and retries publishing later
# If false (default), the Captcha must be solved manually to continue
restart_delay: 1h 30m # Time to wait before retrying after a Captcha was encountered (default: 6h)
# browser configuration
browser:
# https://peter.sh/experiments/chromium-command-line-switches/
arguments:
# https://stackoverflow.com/a/50725918/5116073
- --disable-dev-shm-usage
- --no-sandbox
# --headless
# --start-maximized
binary_location: # path to custom browser executable, if not specified will be looked up on PATH
extensions: [] # a list of .crx extension files to be loaded
use_private_window: true
user_data_dir: "" # see https://github.com/chromium/chromium/blob/main/docs/user_data_dir.md
profile_name: ""
# update check configuration
update_check:
enabled: true # Enable/disable update checks
channel: latest # One of: latest, prerelease
interval: 7d # Check interval (e.g. 7d for 7 days)
# If the interval is invalid, too short (<1d), or too long (>30d),
# the bot logs a warning and uses a default interval for this run:
# - 1d for 'prerelease' channel
# - 7d for 'latest' channel
# The config file is not changed automatically; please fix your config to avoid repeated warnings.
# login credentials
login:
username: ""
password: ""
# diagnostics (optional) - see "Login Detection Behavior" section below for usage details
diagnostics:
login_detection_capture: false # Capture screenshot + HTML when login state is UNKNOWN
pause_on_login_detection_failure: false # Pause for manual inspection (interactive only)
output_dir: "" # Custom output directory (default: portable .temp/diagnostics, xdg cache/diagnostics)
```bash
kleinanzeigen-bot create-config
```
Slow networks or sluggish remote browsers often just need a higher `timeouts.multiplier`, while truly problematic selectors can get explicit values directly under `timeouts`.
> **Developer Note:** Remember to regenerate the schemas after changing the configuration model so editors stay in sync.
### Login Detection Behavior
The bot uses a **server-side auth probe** to detect login state more reliably:
1. **Auth probe (primary method)**: Sends a GET request to `{root_url}/m-meine-anzeigen-verwalten.json?sort=DEFAULT`
- Returns `LOGGED_IN` if response is HTTP 200 with valid JSON containing `"ads"` key
- Returns `LOGGED_OUT` if response is HTTP 401/403 or HTML contains login markers
- Returns `UNKNOWN` on timeouts, assertion failures, or unexpected response bodies
2. **DOM fallback**: Only consulted when auth probe returns `UNKNOWN`
- Looks for `.mr-medium` element containing username
- Falls back to `#user-email` ID
- Uses `login_detection` timeout (default: 10.0 seconds)
This approach reduces unnecessary re-login attempts because the server-side probe is not affected by client-side rendering delays (SPA hydration) or A/B test variations, though it may return UNKNOWN and fall back to DOM-based checks.
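The decision order described above can be sketched as follows. This is an illustration under stated assumptions, not the bot's implementation: `classify_probe`, `detect_login`, and the `LOGIN_MARKERS` placeholder are hypothetical, and treating a missing DOM element as `LOGGED_OUT` is an assumption.

```python
import json
from enum import Enum


class LoginState(Enum):
    LOGGED_IN = "LOGGED_IN"
    LOGGED_OUT = "LOGGED_OUT"
    UNKNOWN = "UNKNOWN"


# Placeholder marker; the real login markers are not listed in this document.
LOGIN_MARKERS = ("login-form",)


def classify_probe(status: int, body: str) -> LoginState:
    """Classify the auth probe response by status code and body."""
    if status in (401, 403):
        return LoginState.LOGGED_OUT
    if status == 200:
        try:
            payload = json.loads(body)
        except json.JSONDecodeError:
            payload = None
        if isinstance(payload, dict) and "ads" in payload:
            return LoginState.LOGGED_IN
    if any(marker in body for marker in LOGIN_MARKERS):
        return LoginState.LOGGED_OUT
    return LoginState.UNKNOWN


def detect_login(status: int, body: str, dom_has_username: bool) -> LoginState:
    """Consult the DOM result only when the probe is inconclusive."""
    state = classify_probe(status, body)
    if state is not LoginState.UNKNOWN:
        return state
    # Assumption: no username element in the DOM means logged out.
    return LoginState.LOGGED_IN if dom_has_username else LoginState.LOGGED_OUT


print(classify_probe(200, '{"ads": []}'))
print(classify_probe(500, "server error"))
print(detect_login(500, "server error", dom_has_username=True))
```

Keeping the probe classification separate from the fallback makes it easy to see that the DOM check never overrides a definite probe result.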
**⚠️ PII Warning:** HTML dumps may contain your account email or other personally identifiable information. Review files in the diagnostics output directory before sharing them publicly.
**Optional diagnostics** help troubleshoot login detection issues:
- Enable `login_detection_capture` to capture screenshots and HTML dumps when state is `UNKNOWN`
- Enable `pause_on_login_detection_failure` to pause the bot for manual inspection (interactive sessions only; requires `login_detection_capture=true`)
- Use custom `output_dir` to specify where artifacts are saved
**Output locations (default):**
- **Portable mode**: `./.temp/diagnostics/`
- **System-wide mode (XDG)**: `~/.cache/kleinanzeigen-bot/diagnostics/` (Linux) or `~/Library/Caches/kleinanzeigen-bot/diagnostics/` (macOS)
- **Custom**: Path resolved relative to your `config.yaml` if `output_dir` is specified
For the complete configuration reference with all available options and detailed explanations, see [Configuration Reference](docs/CONFIGURATION.md).
### <a name="ad-config"></a>2) Ad configuration
@@ -432,216 +308,26 @@ Each ad is described in a separate JSON or YAML file with prefix `ad_<filename>`
Parameter values specified in the `ad_defaults` section of the `config.yaml` file don't need to be specified again in the ad configuration file.
The following parameters can be configured:
For the complete ad configuration reference including automatic price reduction, shipping options, and description prefix/suffix, see [Ad Configuration Reference](docs/AD_CONFIGURATION.md).
```yaml
# yaml-language-server: $schema=https://raw.githubusercontent.com/Second-Hand-Friends/kleinanzeigen-bot/refs/heads/main/schemas/ad.schema.json
active: # true or false (default: true)
type: # one of: OFFER, WANTED (default: OFFER)
title:
description: # can be multiline, see syntax here https://yaml-multiline.info/
### <a name="existing-browser"></a>3) Using an existing browser window
description_prefix: # optional prefix to be added to the description overriding the default prefix
description_suffix: # optional suffix to be added to the description overriding the default suffix
By default a new browser process will be launched. To reuse a manually launched browser window/process, you can enable remote debugging. This is useful for debugging or when you want to keep your browser session open.
# built-in category name as specified in https://github.com/Second-Hand-Friends/kleinanzeigen-bot/blob/main/src/kleinanzeigen_bot/resources/categories.yaml
# or custom category name as specified in config.yaml
# or category ID (e.g. 161/278)
category: # e.g. "Elektronik > Notebooks"
price: # price in euros; decimals allowed but will be rounded to nearest whole euro on processing (prefer whole euros for predictability)
price_type: # one of: FIXED, NEGOTIABLE, GIVE_AWAY (default: NEGOTIABLE)
auto_price_reduction:
enabled: # true or false to enable automatic price reduction on reposts (default: false)
strategy: # "PERCENTAGE" or "FIXED" (required when enabled is true)
amount: # reduction amount; interpreted as percent for PERCENTAGE or currency units for FIXED (prefer whole euros for predictability)
min_price: # required when enabled is true; minimum price floor (use 0 for no lower bound, prefer whole euros for predictability)
delay_reposts: # number of reposts to wait before first reduction (default: 0)
delay_days: # number of days to wait after publication before reductions (default: 0)
# NOTE: All prices are rounded to whole euros after each reduction step.
special_attributes:
# haus_mieten.zimmer_d: value # Zimmer
shipping_type: # one of: PICKUP, SHIPPING, NOT_APPLICABLE (default: SHIPPING)
shipping_costs: # e.g. 2.95 (for individual postage, keep shipping_type SHIPPING and leave shipping_options empty)
# specify shipping options / packages
# it is possible to select multiple packages, but only from one size (S, M, L)!
# possible package types for size S:
# - DHL_2
# - Hermes_Päckchen
# - Hermes_S
# possible package types for size M:
# - DHL_5
# - Hermes_M
# possible package types for size L:
# - DHL_10
# - DHL_20
# - DHL_31,5
# - Hermes_L
shipping_options: []
sell_directly: # true or false, requires shipping_type SHIPPING to take effect (default: false)
# list of wildcard patterns to select images
# if relative paths are specified, then they are relative to this ad configuration file
images:
#- laptop_*.{jpg,png}
contact:
name:
street:
zipcode:
phone: "" # IMPORTANT: surround phone number with quotes to prevent removal of leading zeros
republication_interval: # every X days the ad should be re-published (default: 7)
# The following fields are automatically managed by the bot:
id: # the ID assigned by kleinanzeigen.de
created_on: # ISO timestamp when the ad was first published
updated_on: # ISO timestamp when the ad was last published
content_hash: # hash of the ad content, used to detect changes
repost_count: # how often the ad has been (re)published; used for automatic price reductions
```
#### Automatic price reduction on reposts
When `auto_price_reduction.enabled` is set to `true`, the bot lowers the configured `price` every time the ad is reposted. The starting point for the calculation is always the base price from your ad file (the value of `price`), ensuring the first publication uses the unchanged amount. For each repost the bot subtracts either a percentage of the previously published price (strategy: PERCENTAGE) or a fixed amount (strategy: FIXED) and clamps the result to `min_price`.
**Important:** Price reductions only apply when using the `publish` command (which deletes the old ad and creates a new one). Using the `update` command to modify ad content does NOT trigger price reductions or increment `repost_count`.
`repost_count` is tracked for every ad (and persisted inside the corresponding `ad_*.yaml`) so reductions continue across runs.
`min_price` is required whenever `enabled` is `true` and must be less than or equal to `price`; this makes an explicit floor (including `0`) mandatory. If `min_price` equals the current price, the bot will log a warning and perform no reduction.
**Note:** `repost_count` and price reduction counters are only incremented and persisted after a successful publish. Failed publish attempts do not advance the counters.
**PERCENTAGE strategy example:**
```yaml
price: 150
price_type: FIXED
auto_price_reduction:
enabled: true
strategy: PERCENTAGE
amount: 10
min_price: 90
delay_reposts: 0
delay_days: 0
```
This posts the ad at 150 € the first time, then 135 € (10%), 122 € (10%), 110 € (10%), 99 € (10%), and stops decreasing at 90 €.
**Note:** The bot applies commercial rounding (ROUND_HALF_UP) to full euros after each reduction step. For example, 121.5 rounds to 122, and 109.8 rounds to 110. This step-wise rounding affects the final price progression, especially for percentage-based reductions.
**FIXED strategy example:**
```yaml
price: 150
price_type: FIXED
auto_price_reduction:
enabled: true
strategy: FIXED
amount: 15
min_price: 90
delay_reposts: 0
delay_days: 0
```
This posts the ad at 150 € the first time, then 135 € (15 €), 120 € (15 €), 105 € (15 €), and stops decreasing at 90 €.
**Note on `delay_days` behavior:** The `delay_days` parameter counts complete 24-hour periods (whole days) since the ad was published. For example, if `delay_days: 7` and the ad was published 6 days and 23 hours ago, the reduction will not yet apply. This ensures predictable behavior and avoids partial-day ambiguity.
Set `auto_price_reduction.enabled: false` (or omit the entire `auto_price_reduction` section) to keep the existing behaviour—prices stay fixed and `repost_count` only acts as tracked metadata for future changes.
You can configure `auto_price_reduction` once under `ad_defaults` in `config.yaml`. The `min_price` can be set there or overridden per ad file as needed.
### <a name="description-prefix-suffix"></a>3) Description Prefix and Suffix
You can add prefix and suffix text to your ad descriptions in two ways:
#### New Format (Recommended)
In your config.yaml file you can specify a `description_prefix` and `description_suffix` under the `ad_defaults` section.
```yaml
ad_defaults:
description_prefix: "Prefix text"
description_suffix: "Suffix text"
```
#### Legacy Format
In your ad configuration file you can specify a `description_prefix` and `description_suffix` under the `description` section.
```yaml
description:
prefix: "Prefix text"
suffix: "Suffix text"
```
#### Precedence
The new format has precedence over the legacy format. If you specify both the new and the legacy format in your config, the new format will be used. We recommend using the new format as it is more flexible and easier to manage.
### <a name="existing-browser"></a>4) Using an existing browser window
By default a new browser process will be launched. To reuse a manually launched browser window/process follow these steps:
1. Manually launch your browser from the command line with the `--remote-debugging-port=<NUMBER>` flag.
You are free to choose an unused port number between 1025 and 65535, e.g.:
- `chrome --remote-debugging-port=9222`
- `chromium --remote-debugging-port=9222`
- `msedge --remote-debugging-port=9222`
This runs the browser in debug mode which allows it to be remote controlled by the bot.
**⚠️ IMPORTANT: Chrome 136+ Security Requirement**
Starting with Chrome 136 (March 2025), Google has implemented security changes that require `--user-data-dir` to be specified when using `--remote-debugging-port`. This prevents attackers from accessing the default Chrome profile and stealing cookies/credentials.
**You must now use:**
```bash
chrome --remote-debugging-port=9222 --user-data-dir=/path/to/custom/directory
```
**And in your config.yaml:**
```yaml
browser:
arguments:
- --remote-debugging-port=9222
- --user-data-dir=/path/to/custom/directory
user_data_dir: "/path/to/custom/directory"
```
**The bot will automatically detect Chrome 136+ and validate your configuration. If validation fails, you'll see clear error messages with specific instructions on how to fix your configuration.**
1. In your config.yaml specify the same flags as browser arguments, e.g.:
```yaml
browser:
arguments:
- --remote-debugging-port=9222
- --user-data-dir=/tmp/chrome-debug-profile # Required for Chrome 136+
user_data_dir: "/tmp/chrome-debug-profile" # Must match the argument above
```
1. When you publish ads now, the manually launched browser will be reused.
> NOTE: If an existing browser is used, all other settings configured under `browser` in your config.yaml file will be ignored
because they are only used to programmatically configure/launch a dedicated browser instance.
> **Security Note:** This change was implemented by Google to protect users from cookie theft attacks. The custom user data directory uses a different encryption key than the default profile, making it more secure for debugging purposes.
For detailed instructions on setting up remote debugging with Chrome 136+ security requirements, see [Browser Troubleshooting - Using an Existing Browser Window](docs/BROWSER_TROUBLESHOOTING.md#using-an-existing-browser-window).
### <a name="browser-connection-issues"></a>Browser Connection Issues
If you encounter browser connection problems, the bot includes a diagnostic command to help identify issues:
**For binary users:**
```bash
kleinanzeigen-bot diagnose
```
**For source users:**
```bash
pdm run app diagnose
```
@@ -662,14 +348,14 @@ This command will check your browser setup and provide troubleshooting informati
- [tillvogt/KleinanzeigenScraper](https://github.com/tillvogt/KleinanzeigenScraper) (Python) Webscraper which stores scraped info from kleinanzeigen.de in an SQL database
- [TLINDEN/Kleingebäck](https://github.com/TLINDEN/kleingebaeck) (Go) kleinanzeigen.de Backup
## <a name="license"></a>License
All files in this repository are released under the [GNU Affero General Public License v3.0 or later](LICENSE.txt).
Individual files contain the following tag instead of the full license text:
```
```text
SPDX-License-Identifier: AGPL-3.0-or-later
```
This enables machine processing of license information based on the SPDX License Identifiers that are available here: https://spdx.org/licenses/.
This enables machine processing of license information based on the SPDX License Identifiers that are available here: <https://spdx.org/licenses/>.

docs/AD_CONFIGURATION.md Normal file

@@ -0,0 +1,314 @@
# Ad Configuration Reference
Complete reference for ad YAML files in kleinanzeigen-bot.
## File Format
Each ad is described in a separate JSON or YAML file with the default `ad_` prefix (for example, `ad_laptop.yaml`). You can customize the prefix via the `ad_files` pattern in `config.yaml`.
Examples below use YAML, but JSON uses the same keys and structure.
Parameter values specified in the `ad_defaults` section of `config.yaml` don't need to be specified again in the ad configuration file.
## Quick Start
Generate sample ad files using the download command:
```bash
# Download all ads from your profile
kleinanzeigen-bot download --ads=all
# Download only new ads (not locally saved yet)
kleinanzeigen-bot download --ads=new
# Download specific ads by ID
kleinanzeigen-bot download --ads=1,2,3
```
For full JSON schema with IDE autocompletion support, see:
- [schemas/ad.schema.json](../schemas/ad.schema.json)
## Configuration Structure
### Basic Ad Properties
Description values can be multiline. See <https://yaml-multiline.info/> for YAML syntax examples.
```yaml
# yaml-language-server: $schema=https://raw.githubusercontent.com/Second-Hand-Friends/kleinanzeigen-bot/refs/heads/main/schemas/ad.schema.json
active: # true or false (default: true)
type: # one of: OFFER, WANTED (default: OFFER)
title: # Ad title
description: # Ad description
```
### Description Prefix and Suffix
You can add prefix and suffix text to your ad descriptions in two ways:
#### New Format (Recommended)
In your `config.yaml` file you can specify a `description_prefix` and `description_suffix` under the `ad_defaults` section:
```yaml
ad_defaults:
description_prefix: "Prefix text"
description_suffix: "Suffix text"
```
#### Legacy Format
In your ad configuration file you can specify a `description_prefix` and `description_suffix`:
```yaml
description_prefix: "Prefix text"
description_suffix: "Suffix text"
```
#### Precedence
The ad-level setting takes precedence over the `config.yaml` default. If you specify both, the ad-level setting is used. We recommend using the `config.yaml` defaults because they are more flexible and easier to manage.
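The precedence rule can be sketched as follows. This is a minimal illustration, not the bot's code: the helper names `resolve_affix` and `build_description` are hypothetical, and treating a missing key as "not set" and joining the parts by plain concatenation are both assumptions.

```python
from typing import Optional


def resolve_affix(ad_value: Optional[str], default_value: Optional[str]) -> str:
    """Return the ad-level value when set, otherwise the config.yaml default."""
    if ad_value is not None:
        return ad_value
    return default_value or ""


def build_description(ad: dict, ad_defaults: dict) -> str:
    """Combine prefix, description, and suffix (plain concatenation is assumed)."""
    prefix = resolve_affix(ad.get("description_prefix"), ad_defaults.get("description_prefix"))
    suffix = resolve_affix(ad.get("description_suffix"), ad_defaults.get("description_suffix"))
    return f"{prefix}{ad.get('description', '')}{suffix}"
```

With `ad_defaults` providing both values and an ad file setting only `description_prefix`, the ad's prefix and the default suffix would be used.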
### Category
Built-in category name, custom category name from `config.yaml`, or category ID.
```yaml
# Built-in category name (see default list at
# https://github.com/Second-Hand-Friends/kleinanzeigen-bot/blob/main/src/kleinanzeigen_bot/resources/categories.yaml)
category: "Elektronik > Notebooks"
# Custom category name (defined in config.yaml)
category: "Verschenken & Tauschen > Tauschen"
# Category ID
category: 161/278
```
### Price and Price Type
```yaml
price: # Price in euros; decimals allowed but will be rounded to nearest whole euro on processing
# (prefer whole euros for predictability)
price_type: # one of: FIXED, NEGOTIABLE, GIVE_AWAY (default: NEGOTIABLE)
```
### Automatic Price Reduction
When `auto_price_reduction.enabled` is set to `true`, the bot lowers the configured `price` every time the ad is reposted.
**Important:** Price reductions only apply when using the `publish` command (which deletes the old ad and creates a new one). Using the `update` command to modify ad content does NOT trigger price reductions or increment `repost_count`.
`repost_count` is tracked for every ad (and persisted inside the corresponding `ad_*.yaml`) so reductions continue across runs.
`min_price` is required whenever `enabled` is `true` and must be less than or equal to `price`; this makes an explicit floor (including `0`) mandatory. If `min_price` equals the current price, the bot will log a warning and perform no reduction.
**Note:** `repost_count` and price reduction counters are only incremented and persisted after a successful publish. Failed publish attempts do not advance the counters.
```yaml
auto_price_reduction:
enabled: # true or false to enable automatic price reduction on reposts (default: false)
strategy: # "PERCENTAGE" or "FIXED" (required when enabled is true)
amount: # Reduction amount; interpreted as percent for PERCENTAGE or currency units for FIXED
# (prefer whole euros for predictability)
min_price: # Required when enabled is true; minimum price floor
# (use 0 for no lower bound, prefer whole euros for predictability)
delay_reposts: # Number of reposts to wait before first reduction (default: 0)
delay_days: # Number of days to wait after publication before reductions (default: 0)
```
**Note:** All prices are rounded to whole euros after each reduction step.
#### PERCENTAGE Strategy Example
```yaml
price: 150
price_type: FIXED
auto_price_reduction:
enabled: true
strategy: PERCENTAGE
amount: 10
min_price: 90
delay_reposts: 0
delay_days: 0
```
This posts the ad at 150 € the first time, then 135 € (10%), 122 € (10%), 110 € (10%), 99 € (10%), and stops decreasing at 90 €.
**Note:** The bot applies commercial rounding (ROUND_HALF_UP) to full euros after each reduction step. For example, 121.5 rounds to 122, and 109.8 rounds to 110. This step-wise rounding affects the final price progression, especially for percentage-based reductions.
#### FIXED Strategy Example
```yaml
price: 150
price_type: FIXED
auto_price_reduction:
enabled: true
strategy: FIXED
amount: 15
min_price: 90
delay_reposts: 0
delay_days: 0
```
This posts the ad at 150 € the first time, then 135 € (15 €), 120 € (15 €), 105 € (15 €), and stops decreasing at 90 €.
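The two progressions above can be reproduced with a short sketch. It is an illustration under stated assumptions, not the bot's implementation: the function names are hypothetical, and rounding before clamping to `min_price` is an assumed order of operations.

```python
from decimal import Decimal, ROUND_HALF_UP


def reduce_price(previous: Decimal, strategy: str, amount: Decimal, min_price: Decimal) -> Decimal:
    """Apply one reduction step, round to whole euros, then clamp to min_price."""
    if strategy == "PERCENTAGE":
        reduced = previous - previous * amount / Decimal(100)
    elif strategy == "FIXED":
        reduced = previous - amount
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    # Commercial rounding to full euros after each step (121.5 -> 122).
    rounded = reduced.quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return max(rounded, min_price)


def price_progression(price, strategy, amount, min_price, reposts):
    """Return the price of the first publication followed by each repost."""
    prices = [Decimal(price)]
    for _ in range(reposts):
        prices.append(reduce_price(prices[-1], strategy, Decimal(amount), Decimal(min_price)))
    return prices
```

Calling `price_progression(150, "PERCENTAGE", 10, 90, 5)` yields 150, 135, 122, 110, 99, 90, and `price_progression(150, "FIXED", 15, 90, 4)` yields 150, 135, 120, 105, 90, matching the examples.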
#### Note on `delay_days` Behavior
The `delay_days` parameter counts complete 24-hour periods (whole days) since the ad was published. For example, if `delay_days: 7` and the ad was published 6 days and 23 hours ago, the reduction will not yet apply. This ensures predictable behavior and avoids partial-day ambiguity.
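A minimal sketch of the whole-day rule, assuming timezone-aware timestamps; the helper names `complete_days_since` and `delay_days_elapsed` are hypothetical and not taken from the bot's code.

```python
from datetime import datetime, timedelta, timezone


def complete_days_since(published: datetime, now: datetime) -> int:
    """Count complete 24-hour periods between publication and now."""
    # Floor division of two timedeltas returns an int and drops partial days.
    return (now - published) // timedelta(days=1)


def delay_days_elapsed(published: datetime, now: datetime, delay_days: int) -> bool:
    """Return True once at least delay_days complete days have passed."""
    return complete_days_since(published, now) >= delay_days


published = datetime(2025, 1, 1, tzinfo=timezone.utc)
almost_seven = published + timedelta(days=6, hours=23)
```

Here `delay_days_elapsed(published, almost_seven, 7)` is false because only six complete days have passed; one hour later it becomes true.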
Set `auto_price_reduction.enabled: false` (or omit the entire `auto_price_reduction` section) to keep the existing behavior—prices stay fixed and `repost_count` only acts as tracked metadata for future changes.
You can configure `auto_price_reduction` once under `ad_defaults` in `config.yaml`. The `min_price` can be set there or overridden per ad file as needed.
### Special Attributes
Special attributes are category-specific key/value pairs. Use the download command to inspect existing ads in your category and reuse the keys you see under `special_attributes`.
```yaml
special_attributes:
# Example for rental properties
# haus_mieten.zimmer_d: "3" # Number of rooms
```
### Shipping Configuration
```yaml
shipping_type: # one of: PICKUP, SHIPPING, NOT_APPLICABLE (default: SHIPPING)
shipping_costs: # e.g. 2.95 (for individual postage, keep shipping_type SHIPPING and leave shipping_options empty)
# Specify shipping options / packages
# It is possible to select multiple packages, but only from one size (S, M, L)!
# Possible package types for size S:
# - DHL_2
# - Hermes_Päckchen
# - Hermes_S
# Possible package types for size M:
# - DHL_5
# - Hermes_M
# Possible package types for size L:
# - DHL_10
# - DHL_20
# - DHL_31,5
# - Hermes_L
shipping_options: []
# Example (size S only):
# shipping_options:
# - DHL_2
# - Hermes_Päckchen
sell_directly: # true or false, requires shipping_type SHIPPING to take effect (default: false)
```
**Shipping types:**
- `PICKUP` - Buyer picks up the item
- `SHIPPING` - Item is shipped (requires shipping costs or options)
- `NOT_APPLICABLE` - Shipping not applicable for this item
**Sell Directly:**
When `sell_directly: true`, buyers can purchase the item directly through the platform without contacting the seller first. This feature only works when `shipping_type: SHIPPING`.
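The single-size rule for `shipping_options` can be checked before publishing with a small sketch like the one below. The size groups are copied from the comments in the YAML above; the function name `shipping_size` and the error handling are hypothetical.

```python
# Size groups as listed in the shipping configuration comments above.
PACKAGE_SIZES = {
    "S": {"DHL_2", "Hermes_Päckchen", "Hermes_S"},
    "M": {"DHL_5", "Hermes_M"},
    "L": {"DHL_10", "DHL_20", "DHL_31,5", "Hermes_L"},
}


def shipping_size(options):
    """Return the single size group shared by all options, or None if empty."""
    sizes = set()
    for option in options:
        size = next((s for s, names in PACKAGE_SIZES.items() if option in names), None)
        if size is None:
            raise ValueError(f"unknown shipping option: {option}")
        sizes.add(size)
    if len(sizes) > 1:
        raise ValueError(f"options span multiple sizes: {sorted(sizes)}")
    return sizes.pop() if sizes else None
```

For instance, `["DHL_2", "Hermes_Päckchen"]` resolves to size `S`, while `["DHL_2", "DHL_5"]` mixes S and M and is rejected.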
### Images
List of wildcard patterns to select images. If relative paths are specified, they are relative to this ad configuration file.
```yaml
images:
# - laptop_*.{jpg,png}
```
### Contact Information
Contact details for the ad. These override defaults from `config.yaml`.
```yaml
contact:
name:
street:
zipcode:
phone: "" # IMPORTANT: surround phone number with quotes to prevent removal of leading zeros
```
### Republication Interval
How often the ad should be republished (in days). Overrides `ad_defaults.republication_interval` from `config.yaml`.
```yaml
republication_interval: # every X days the ad should be re-published (default: 7)
```
### Auto-Managed Fields
The following fields are automatically managed by the bot. Do not manually edit these unless you know what you're doing.
```yaml
id: # The ID assigned by kleinanzeigen.de
created_on: # ISO timestamp when the ad was first published
updated_on: # ISO timestamp when the ad was last published
content_hash: # Hash of the ad content, used to detect changes
repost_count: # How often the ad has been (re)published; used for automatic price reductions
```
## Complete Example
```yaml
# yaml-language-server: $schema=https://raw.githubusercontent.com/Second-Hand-Friends/kleinanzeigen-bot/refs/heads/main/schemas/ad.schema.json
active: true
type: OFFER
title: "Example Ad Title"
description: |
This is a multi-line description.
You can add as much detail as you want here.
The bot will preserve line breaks and formatting.
description_prefix: "For sale: " # Optional ad-level override; defaults can live in config.yaml
description_suffix: " Please message if interested!" # Optional ad-level override
category: "Elektronik > Notebooks"
price: 150
price_type: FIXED
auto_price_reduction:
enabled: true
strategy: PERCENTAGE
amount: 10
min_price: 90
delay_reposts: 0
delay_days: 0
shipping_type: SHIPPING
shipping_costs: 4.95
sell_directly: true
images:
- "images/laptop_*.jpg"
contact:
name: "John Doe"
street: "Main Street 123"
zipcode: "12345"
phone: "0123456789"
republication_interval: 7
```
## Best Practices
1. **Use meaningful filenames**: Name your ad files descriptively, e.g., `ad_laptop_hp_15.yaml`
1. **Set defaults in config.yaml**: Put common values in `ad_defaults` to avoid repetition
1. **Test before bulk publishing**: Use `--ads=changed` or `--ads=new` to test changes before republishing all ads
1. **Back up your ad files**: Keep them in version control if you want to track changes
1. **Use price reductions carefully**: Set appropriate `min_price` to avoid underpricing
1. **Check shipping options**: Ensure your shipping options match the actual package size and cost
## Troubleshooting
- **Schema validation errors**: Run `kleinanzeigen-bot verify` (binary) or `pdm run app verify` (source) to see which fields fail validation.
- **Price reduction not applying**: Confirm `auto_price_reduction.enabled` is `true`, `min_price` is set, and you are using `publish` (not `update`). Remember ad-level values override `ad_defaults`.
- **Shipping configuration issues**: Use `shipping_type: SHIPPING` when setting `shipping_costs` or `shipping_options`, and pick options from a single size group (S/M/L).
- **Category not found**: Verify the category name or ID and check any custom mappings in `config.yaml`.
- **File naming/prefix mismatch**: Ensure ad files match your `ad_files` glob and prefix (default `ad_`).
- **Image path resolution**: Relative paths are resolved from the ad file location; use absolute paths and check file permissions if images are not found.


@@ -8,13 +8,15 @@ This guide helps you resolve common browser connection issues with the kleinanze
Google implemented security changes in Chrome 136 that require `--user-data-dir` to be specified when using `--remote-debugging-port`. This prevents attackers from accessing the default Chrome profile and stealing cookies/credentials.
**Quick Fix:**
### Quick Fix
```bash
# Start Chrome with custom user data directory
chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug-profile
```
**In your config.yaml:**
### In your config.yaml
```yaml
browser:
arguments:
@@ -32,16 +34,19 @@ For more details, see [Chrome 136+ Security Changes](#5-chrome-136-security-chan
Run the diagnostic command to automatically check your setup:
**For binary users:**
```bash
kleinanzeigen-bot diagnose
```
**For source users:**
```bash
pdm run app diagnose
```
This will check:
- Browser binary availability and permissions
- User data directory permissions
- Remote debugging port status
@@ -52,7 +57,7 @@ This will check:
**Automatic Chrome 136+ Validation:**
The bot automatically detects Chrome/Edge 136+ and validates your configuration. If you're using Chrome 136+ with remote debugging but missing the required `--user-data-dir` setting, you'll see clear error messages like:
```
```console
Chrome 136+ configuration validation failed: Chrome 136+ requires --user-data-dir
Please update your configuration to include --user-data-dir for remote debugging
```
@@ -62,48 +67,59 @@ The bot will also provide specific instructions on how to fix your configuration
### Issue: Slow page loads or recurring TimeoutError
**Symptoms:**
- `_extract_category_from_ad_page` fails intermittently due to breadcrumb lookups timing out
- Captcha/SMS/GDPR prompts appear right after a timeout
- Requests to GitHub's API fail sporadically with timeout errors
**Solutions:**
1. Increase `timeouts.multiplier` in `config.yaml` (e.g. `2.0` doubles every timeout consistently).
2. Override specific keys under `timeouts` (e.g. `pagination_initial: 20.0`) if only a single selector is problematic.
3. Keep `retry_enabled` on so that DOM lookups are retried with exponential backoff.
1. Increase `timeouts.multiplier` in `config.yaml` (e.g., `2.0` doubles every timeout consistently).
1. Override specific keys under `timeouts` (e.g., `pagination_initial: 20.0`) if only a single selector is problematic.
1. For slow email verification prompts, raise `timeouts.email_verification`.
1. Keep `retry_enabled` on so that DOM lookups are retried with exponential backoff.
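To make the interaction between the multiplier and retries concrete, here is a rough sketch of how an effective timeout could be derived. The function `effective_timeout` and the `backoff_factor` parameter (with its default of `1.5`) are hypothetical; the bot's actual formula and defaults may differ.

```python
def effective_timeout(base_seconds: float, multiplier: float = 1.0,
                      attempt: int = 0, backoff_factor: float = 1.5) -> float:
    """Scale a base timeout by the global multiplier and an exponential backoff term.

    attempt is 0 for the first try; backoff_factor is an assumed value.
    """
    return base_seconds * multiplier * (backoff_factor ** attempt)
```

Under these assumptions, a 10-second base timeout with `multiplier: 2.0` becomes 20 seconds on the first attempt and grows on each retry.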
### Issue: Bot fails to detect existing login session
**Symptoms:**
- Bot logs in again despite already being authenticated
- Intermittent (50/50) login detection behavior
- More common with profiles unused for 20+ days
**How login detection works:**
The bot checks your login status using a fast server request first, with a fallback to checking page elements if needed.
The bot checks your login status using page elements first (to minimize bot-like behavior), with a fallback to a server-side request if needed.
The bot uses a **server-side auth probe** as the primary method to detect login state:
The bot uses a **DOM-based check** as the primary method to detect login state:
1. **Auth probe (preferred)**: Sends a GET request to `{root_url}/m-meine-anzeigen-verwalten.json?sort=DEFAULT`
- Returns `LOGGED_IN` if the response is HTTP 200 with valid JSON containing `"ads"` key
- Returns `LOGGED_OUT` if response is HTTP 401/403 or HTML contains login markers
- Returns `UNKNOWN` on timeouts, assertion failures, or unexpected response bodies
1. **DOM check (preferred - stealthy)**: Checks for user profile elements in the page
2. **DOM fallback**: Only used when the auth probe returns `UNKNOWN`
- Looks for `.mr-medium` element containing username
- Falls back to `#user-email` ID
- Uses the `login_detection` timeout (default: 10.0 seconds; effective timeout with retry/backoff)
- Minimizes bot detection by avoiding JSON API requests that normal users wouldn't trigger
2. **Auth probe fallback (more reliable)**: Sends a GET request to `{root_url}/m-meine-anzeigen-verwalten.json?sort=DEFAULT`
- Returns `LOGGED_IN` if the response is HTTP 200 with valid JSON containing `"ads"` key
- Returns `LOGGED_OUT` if response is HTTP 401/403 or HTML contains login markers
- Returns `UNKNOWN` on timeouts, assertion failures, or unexpected response bodies
- Only used when DOM check is inconclusive (UNKNOWN or timed out)
3. **Diagnostics capture**: If the state remains `UNKNOWN` and `diagnostics.login_detection_capture` is enabled
- Captures a screenshot and HTML dump for troubleshooting
- Pauses for manual inspection if `diagnostics.pause_on_login_detection_failure` is enabled and running in an interactive terminal
- Captures a screenshot and HTML dump for troubleshooting
- Pauses for manual inspection if `diagnostics.pause_on_login_detection_failure` is enabled and running in an interactive terminal
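The DOM-first order with an auth-probe fallback could be sketched roughly like this. The state names come from the text; the function `detect_login_state` and the callable-based design are hypothetical and only illustrate the control flow, not the bot's actual code.

```python
from enum import Enum
from typing import Callable


class LoginState(Enum):
    LOGGED_IN = "LOGGED_IN"
    LOGGED_OUT = "LOGGED_OUT"
    UNKNOWN = "UNKNOWN"


def detect_login_state(dom_check: Callable[[], LoginState],
                       auth_probe: Callable[[], LoginState]) -> LoginState:
    """Run the DOM check first and consult the auth probe only if it is inconclusive."""
    try:
        state = dom_check()
    except TimeoutError:
        # A timed-out DOM lookup is treated as inconclusive (assumption).
        state = LoginState.UNKNOWN
    if state is not LoginState.UNKNOWN:
        return state
    return auth_probe()
```

If both checks return `UNKNOWN`, the caller would then decide whether to capture diagnostics, as described in step 3.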
**What `login_detection` controls:**
- Maximum time (seconds) to wait for user profile DOM elements when checking if already logged in
- Default: `10.0` seconds (effective timeout with retry/backoff)
- Used at startup before attempting login
- Note: With the new auth probe, this timeout only applies to the DOM fallback path
- Note: With DOM-first order, this timeout applies to the primary DOM check path
**When to increase `login_detection`:**
- Frequent unnecessary re-logins despite being authenticated
- Slow or unstable network connection
- Using browser profiles that haven't been active for weeks
@@ -111,6 +127,7 @@ The bot uses a **server-side auth probe** as the primary method to detect login
> **⚠️ PII Warning:** HTML dumps captured by diagnostics may contain your account email or other personally identifiable information. Review files in the diagnostics output directory before sharing them publicly.
**Example:**
```yaml
timeouts:
login_detection: 15.0 # For slower networks or old sessions
@@ -127,18 +144,21 @@ diagnostics:
### Issue 1: "Failed to connect to browser" with "root" error
**Symptoms:**
- Error message mentions "One of the causes could be when you are running as root"
- Connection fails when using existing browser profiles
**Causes:**
1. Running the application as root user
2. Browser profile is locked or in use by another process
3. Insufficient permissions to access the browser profile
4. Browser is not properly started with remote debugging enabled
1. Browser profile is locked or in use by another process
1. Insufficient permissions to access the browser profile
1. Browser is not properly started with remote debugging enabled
**Solutions:**
#### 1. Don't run as root
```bash
# ❌ Don't do this
sudo pdm run app publish
@@ -148,6 +168,7 @@ pdm run app publish
```
#### 2. Close all browser instances
```bash
# On Linux/macOS
pkill -f chrome
@@ -160,7 +181,9 @@ taskkill /f /im msedge.exe
```
#### 3. Remove user_data_dir temporarily
Edit your `config.yaml` and comment out or remove the `user_data_dir` line:
```yaml
browser:
# user_data_dir: C:\Users\user\AppData\Local\Microsoft\Edge\User Data # Comment this out
@@ -168,6 +191,7 @@ browser:
```
#### 4. Start browser manually with remote debugging
```bash
# For Chrome (macOS)
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug-profile
@@ -189,6 +213,7 @@ chromium --remote-debugging-port=9222 --user-data-dir=/tmp/chromium-debug-profil
```
Then in your `config.yaml`:
```yaml
browser:
arguments:
@@ -197,29 +222,33 @@ browser:
user_data_dir: "/tmp/chrome-debug-profile" # Must match the argument above
```
**⚠️ IMPORTANT: Chrome 136+ Security Requirement**
#### ⚠️ IMPORTANT: Chrome 136+ Security Requirement
Starting with Chrome 136 (March 2025), Google has implemented security changes that require `--user-data-dir` to be specified when using `--remote-debugging-port`. This prevents attackers from accessing the default Chrome profile and stealing cookies/credentials. See [Chrome's security announcement](https://developer.chrome.com/blog/remote-debugging-port?hl=de) for more details.
### Issue 2: "Browser process not reachable at 127.0.0.1:9222"
**Symptoms:**
- Port check fails when trying to connect to existing browser
- Browser appears to be running but connection fails
**Causes:**
1. Browser not started with remote debugging port
2. Port is blocked by firewall
3. Browser crashed or closed
4. Timing issue - browser not fully started
5. Browser update changed remote debugging behavior
6. Existing Chrome instance conflicts with new debugging session
7. **Chrome 136+ security requirement not met** (most common cause since March 2025)
1. Port is blocked by firewall
1. Browser crashed or closed
1. Timing issue - browser not fully started
1. Browser update changed remote debugging behavior
1. Existing Chrome instance conflicts with new debugging session
1. **Chrome 136+ security requirement not met** (most common cause since March 2025)
**Solutions:**
#### 1. Verify browser is started with remote debugging
Make sure your browser is started with the correct flag:
```bash
# Check if browser is running with remote debugging
netstat -an | grep 9222 # Linux/macOS
@@ -227,6 +256,7 @@ netstat -an | findstr 9222 # Windows
```
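As a cross-platform alternative to `netstat`, a short Python sketch can test whether anything accepts TCP connections on the debugging port. The helper name `is_port_open` is hypothetical; it only confirms that the port is reachable, not that the listener is a browser.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Calling `is_port_open("127.0.0.1", 9222)` after starting the browser should return `True` if remote debugging is listening on that port.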
#### 2. Start browser manually first
```bash
# Start browser with remote debugging
chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug
@@ -238,9 +268,12 @@ pdm run app publish # For source users
```
#### 3. macOS-specific: Chrome started but connection fails
If you're on macOS and Chrome is started with remote debugging but the bot still can't connect:
**⚠️ IMPORTANT: This is a Chrome/macOS security issue that requires a dedicated user data directory**
#### ⚠️ IMPORTANT: macOS Security Requirement
This is a Chrome/macOS security issue that requires a dedicated user data directory.
```bash
# Method 1: Use the full path to Chrome with dedicated user data directory
@@ -272,12 +305,14 @@ browser:
```
**Common macOS issues:**
- Chrome/macOS security restrictions require a dedicated user data directory
- The `--user-data-dir` flag is **mandatory** for remote debugging on macOS
- Use `--disable-dev-shm-usage` to avoid shared memory issues
- The user data directory must match between manual Chrome startup and config.yaml
#### 4. Browser update issues
If it worked before but stopped working after a browser update:
```bash
@@ -300,12 +335,14 @@ taskkill /f /im chrome.exe # Windows
```
**After browser updates:**
- Chrome may have changed how remote debugging works
- Security restrictions may have been updated
- Try using a fresh user data directory to avoid conflicts
- Ensure you're using the latest version of the bot
#### 5. Chrome 136+ Security Changes (March 2025)
If you're using Chrome 136 or later and remote debugging stopped working:
**The Problem:**
@@ -323,6 +360,7 @@ chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug-profile
```
**In your config.yaml:**
```yaml
browser:
arguments:
@@ -332,22 +370,27 @@ browser:
```
**Why this change was made:**
- Prevents attackers from accessing the default Chrome profile
- Protects cookies and login credentials
- Uses a different encryption key for the custom profile
- Makes debugging more secure
**For more information:**
- [Chrome's security announcement](https://developer.chrome.com/blog/remote-debugging-port?hl=de)
- [GitHub issue discussion](https://github.com/Second-Hand-Friends/kleinanzeigen-bot/issues/604)
#### 6. Check firewall settings
- Windows: Check Windows Defender Firewall
- macOS: Check System Preferences > Security & Privacy > Firewall
- Linux: Check iptables or ufw settings
#### 7. Use different port
Try a different port in case 9222 is blocked:
```yaml
browser:
arguments:
@@ -357,6 +400,7 @@ browser:
### Issue 3: Profile directory issues
**Symptoms:**
- Errors about profile directory not found
- Permission denied errors
- Profile locked errors
@@ -364,6 +408,7 @@ browser:
**Solutions:**
#### 1. Use temporary profile
```yaml
browser:
user_data_dir: "/tmp/chrome-temp" # Linux/macOS
@@ -372,6 +417,7 @@ browser:
```
#### 2. Check profile permissions
```bash
# Linux/macOS
ls -la ~/.config/google-chrome/
@@ -382,6 +428,7 @@ chmod 755 ~/.config/google-chrome/
```
#### 3. Remove profile temporarily
```yaml
browser:
# user_data_dir: "" # Comment out or remove
@@ -392,16 +439,19 @@ browser:
### Issue 4: Platform-specific issues
#### Windows
- **Antivirus software**: Add browser executable to exclusions
- **Windows Defender**: Add folder to exclusions
- **UAC**: Run as administrator if needed (but not recommended)
#### macOS
- **Gatekeeper**: Allow browser in System Preferences > Security & Privacy
- **SIP**: System Integrity Protection might block some operations
- **Permissions**: Grant full disk access to terminal/IDE
#### Linux
- **Sandbox**: Add `--no-sandbox` to browser arguments
- **Root user**: Never run as root, use regular user
- **Display**: Ensure X11 or Wayland is properly configured
@@ -409,6 +459,7 @@ browser:
## Configuration Examples
### Basic working configuration
```yaml
browser:
arguments:
@@ -418,6 +469,7 @@ browser:
```
### Using existing browser
```yaml
browser:
arguments:
@@ -428,6 +480,7 @@ browser:
```
### Using existing browser on macOS (REQUIRED configuration)
```yaml
browser:
arguments:
@@ -439,6 +492,7 @@ browser:
```
### Using specific profile
```yaml
browser:
user_data_dir: "C:\\Users\\username\\AppData\\Local\\Google\\Chrome\\User Data"
@@ -450,6 +504,7 @@ browser:
## Advanced Troubleshooting
### Check browser compatibility
```bash
# Test if browser can be started manually
# macOS
@@ -467,6 +522,7 @@ msedge --version
```
### Monitor browser processes
```bash
# Linux/macOS
ps aux | grep chrome
@@ -478,6 +534,7 @@ netstat -an | findstr 9222
```
### Debug with verbose logging
```bash
kleinanzeigen-bot -v publish # For binary users
# or
@@ -485,19 +542,72 @@ pdm run app -v publish # For source users
```
### Test browser connection manually
```bash
# Test if port is accessible
curl http://localhost:9222/json/version
```
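The same reachability check can be scripted. The following is a minimal Python sketch and not part of the bot: the helper name `is_debug_port_open` is hypothetical, and it only tests whether something accepts TCP connections on the port, which is a weaker check than the `curl` request above.

```python
import socket


def is_debug_port_open(host: str = "127.0.0.1", port: int = 9222, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or unreachable host
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_debug_port_open()` with no arguments probes `127.0.0.1:9222`. A `False` result points at the causes listed under Issue 2, while `True` only shows that some process is listening there.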
## Using an Existing Browser Window
By default a new browser process will be launched. To reuse a manually launched browser window/process, follow these steps:
1. Manually launch your browser from the command line with the `--remote-debugging-port=<NUMBER>` flag.
   You are free to choose an unused port number between 1025 and 65535, for example:
   - `chrome --remote-debugging-port=9222`
   - `chromium --remote-debugging-port=9222`
   - `msedge --remote-debugging-port=9222`

   This runs the browser in debug mode, which allows it to be remote controlled by the bot.
   **⚠️ IMPORTANT: Chrome 136+ Security Requirement**
   Starting with Chrome 136 (March 2025), Google has implemented security changes that require `--user-data-dir` to be specified when using `--remote-debugging-port`. This prevents attackers from accessing the default Chrome profile and stealing cookies/credentials.
   **You must now use:**
   ```bash
   chrome --remote-debugging-port=9222 --user-data-dir=/path/to/custom/directory
   ```
   **And in your config.yaml:**
   ```yaml
   browser:
     arguments:
       - --remote-debugging-port=9222
       - --user-data-dir=/path/to/custom/directory
     user_data_dir: "/path/to/custom/directory"
   ```
   **The bot will automatically detect Chrome 136+ and validate your configuration. If validation fails, you'll see clear error messages with specific instructions on how to fix your configuration.**
1. In your config.yaml, specify the same flags as browser arguments, for example:
   ```yaml
   browser:
     arguments:
       - --remote-debugging-port=9222
       - --user-data-dir=/tmp/chrome-debug-profile # Required for Chrome 136+
     user_data_dir: "/tmp/chrome-debug-profile" # Must match the argument above
   ```
1. When you now publish ads, the manually launched browser will be reused.
> NOTE: If an existing browser is used, all other settings configured under `browser` in your config.yaml file will be ignored
> because they are only used to programmatically configure/launch a dedicated browser instance.
>
> **Security Note:** This change was implemented by Google to protect users from cookie theft attacks. The custom user data directory uses a different encryption key than the default profile, making it more secure for debugging purposes.
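
The consistency rules described in this section can be checked before starting the bot. The sketch below is an illustration only: `check_remote_debugging_config` is a hypothetical helper, not the bot's own validation (which is not shown here), and it assumes the `browser` section has already been loaded into a plain dictionary.

```python
def check_remote_debugging_config(browser_cfg: dict) -> list[str]:
    """Return a list of problems; an empty list means the settings look consistent."""
    args = browser_cfg.get("arguments") or []
    has_port = any(a.startswith("--remote-debugging-port=") for a in args)
    # Collect the value of every --user-data-dir=<path> argument
    dir_args = [a.split("=", 1)[1] for a in args if a.startswith("--user-data-dir=")]
    problems = []
    if has_port and not dir_args:
        problems.append("--remote-debugging-port is set but --user-data-dir is missing")
    configured = browser_cfg.get("user_data_dir") or ""
    if dir_args and configured and dir_args[0] != configured:
        problems.append("user_data_dir does not match the --user-data-dir argument")
    return problems
```

An empty result for the example configuration above means the argument and the `user_data_dir` key agree. Any returned message names the setting to fix.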
## Getting Help
If you're still experiencing issues:
1. Run the diagnostic command: `kleinanzeigen-bot diagnose` (binary) or `pdm run app diagnose` (source)
2. Check the log file for detailed error messages
3. Try the solutions above step by step
4. Create an issue on GitHub with:
- Output from the diagnose command
- Your `config.yaml` (remove sensitive information)
- Error messages from the log file
@@ -508,8 +618,8 @@ If you're still experiencing issues:
To avoid browser connection issues:
1. **Don't run as root** - Always use a regular user account
2. **Close other browser instances** - Ensure no other browser processes are running
3. **Use temporary profiles** - Avoid conflicts with existing browser sessions
4. **Keep browser updated** - Use the latest stable version
5. **Check permissions** - Ensure proper file and folder permissions
6. **Monitor system resources** - Ensure sufficient memory and disk space

docs/CONFIGURATION.md Normal file

@@ -0,0 +1,296 @@
# Configuration Reference
Complete reference for `config.yaml`, the main configuration file for kleinanzeigen-bot.
## Quick Start
To generate a default configuration file with all current defaults:
```bash
kleinanzeigen-bot create-config
```
For the full JSON schema with IDE autocompletion support, see:
- [schemas/config.schema.json](../schemas/config.schema.json)
To enable IDE autocompletion in `config.yaml`, add this at the top of the file:
```yaml
# yaml-language-server: $schema=schemas/config.schema.json
```
For ad files, use the ad schema instead:
```yaml
# yaml-language-server: $schema=schemas/ad.schema.json
```
## File Location
The bot looks for `config.yaml` in the current directory by default. You can specify a different location using the `--config` command line option:
```bash
kleinanzeigen-bot --config /path/to/config.yaml publish
```
Valid file extensions: `.json`, `.yaml`, `.yml`
## Configuration Structure
### ad_files
Glob (wildcard) patterns to select ad configuration files. If relative paths are specified, they are relative to this configuration file.
```yaml
ad_files:
- "./**/ad_*.{json,yml,yaml}"
```
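To illustrate how such a pattern resolves relative to the configuration file, here is a minimal Python sketch. It is an assumption-laden illustration, not the bot's implementation: the helper names `expand_braces` and `resolve_ad_files` are hypothetical, and because Python's standard `glob` module does not understand `{a,b}` groups, the sketch expands them by hand first.

```python
import glob
import os
import re


def expand_braces(pattern: str) -> list[str]:
    """Expand the first {a,b,c} group, then recurse until none are left."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if not match:
        return [pattern]
    head, tail = pattern[: match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def resolve_ad_files(config_path: str, patterns: list[str]) -> list[str]:
    """Resolve glob patterns relative to the directory that contains config_path."""
    base_dir = os.path.dirname(os.path.abspath(config_path))
    found = set()
    for pattern in patterns:
        for plain in expand_braces(pattern):
            full = os.path.join(base_dir, plain)
            # recursive=True lets "**" match any number of directory levels
            found.update(os.path.normpath(p) for p in glob.glob(full, recursive=True))
    return sorted(found)
```

With the default pattern, a file such as `ads/ad_bike.yaml` next to `config.yaml` would be picked up, while `ads/notes.txt` would not.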
### ad_defaults
Default values for ads that can be overridden in each ad configuration file.
```yaml
ad_defaults:
active: true
type: OFFER # one of: OFFER, WANTED
description_prefix: ""
description_suffix: ""
price_type: NEGOTIABLE # one of: FIXED, NEGOTIABLE, GIVE_AWAY, NOT_APPLICABLE
shipping_type: SHIPPING # one of: PICKUP, SHIPPING, NOT_APPLICABLE
# NOTE: shipping_costs and shipping_options must be configured per-ad, not as defaults
sell_directly: false # requires shipping_type SHIPPING to take effect
contact:
name: ""
street: ""
zipcode: ""
phone: "" # IMPORTANT: surround phone number with quotes to prevent removal of leading zeros
republication_interval: 7 # every X days ads should be re-published
```
> **Tip:** For current defaults of all timeout and diagnostic settings, run `kleinanzeigen-bot create-config` or see the [JSON schema](../schemas/config.schema.json).
### categories
Additional name to category ID mappings. See the default list at:
[https://github.com/Second-Hand-Friends/kleinanzeigen-bot/blob/main/src/kleinanzeigen_bot/resources/categories.yaml](https://github.com/Second-Hand-Friends/kleinanzeigen-bot/blob/main/src/kleinanzeigen_bot/resources/categories.yaml)
```yaml
categories:
  Verschenken & Tauschen > Tauschen: 272/273
  Verschenken & Tauschen > Verleihen: 272/274
  Verschenken & Tauschen > Verschenken: 272/192
```
### timeouts
Timeout tuning for various browser operations. Adjust these if you experience slow page loads or recurring timeouts.
```yaml
timeouts:
  multiplier: 1.0 # Scale all timeouts (e.g. 2.0 for slower networks)
  default: 5.0 # Base timeout for web_find/web_click/etc.
  page_load: 15.0 # Timeout for web_open page loads
  captcha_detection: 2.0 # Timeout for captcha iframe detection
  sms_verification: 4.0 # Timeout for SMS verification banners
  email_verification: 4.0 # Timeout for email verification prompts
  gdpr_prompt: 10.0 # Timeout when handling GDPR dialogs
  login_detection: 10.0 # Timeout for DOM-based login detection (primary method)
  publishing_result: 300.0 # Timeout for publishing status checks
  publishing_confirmation: 20.0 # Timeout for publish confirmation redirect
  image_upload: 30.0 # Timeout for image upload and server-side processing
  pagination_initial: 10.0 # Timeout for first pagination lookup
  pagination_follow_up: 5.0 # Timeout for subsequent pagination clicks
  quick_dom: 2.0 # Generic short DOM timeout (shipping dialogs, etc.)
  update_check: 10.0 # Timeout for GitHub update requests
  chrome_remote_probe: 2.0 # Timeout for local remote-debugging probes
  chrome_remote_debugging: 5.0 # Timeout for remote debugging API calls
  chrome_binary_detection: 10.0 # Timeout for chrome --version subprocess
  retry_enabled: true # Enables DOM retry/backoff when timeouts occur
  retry_max_attempts: 2
  retry_backoff_factor: 1.5
```
**Timeout tuning tips:**
- Slow networks or sluggish remote browsers often just need a higher `timeouts.multiplier`
- For truly problematic selectors, override specific keys directly under `timeouts`
- Keep `retry_enabled` on so DOM lookups are retried with exponential backoff
For more details on timeout configuration and troubleshooting, see [Browser Troubleshooting](./BROWSER_TROUBLESHOOTING.md).
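To get a feel for how the multiplier and the retry settings interact, the following Python sketch computes the timeout used for each attempt. It rests on two assumptions that this page does not confirm: that `retry_max_attempts` counts retries after the first try, and that each retry multiplies the previous timeout by `retry_backoff_factor`. The function name `attempt_timeouts` is hypothetical.

```python
def attempt_timeouts(base: float, multiplier: float = 1.0,
                     retry_enabled: bool = True, retry_max_attempts: int = 2,
                     retry_backoff_factor: float = 1.5) -> list[float]:
    """Return the timeout in seconds for the first try and for each retry."""
    # Assumption: retries are extra attempts on top of the initial one
    retries = retry_max_attempts if retry_enabled else 0
    return [base * multiplier * retry_backoff_factor ** n for n in range(retries + 1)]
```

Under these assumptions, `attempt_timeouts(5.0)` yields 5.0, 7.5, and 11.25 seconds for the default `timeouts.default`, and doubling `multiplier` doubles every entry.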
### download
Download configuration for the `download` command.
```yaml
download:
  include_all_matching_shipping_options: false # if true, all shipping options matching the package size will be included
  excluded_shipping_options: [] # list of shipping options to exclude, e.g. ['DHL_2', 'DHL_5']
  folder_name_max_length: 100 # maximum length for folder names when downloading ads (default: 100)
  rename_existing_folders: false # if true, rename existing folders without titles to include titles (default: false)
```
### publishing
Publishing configuration.
```yaml
publishing:
  delete_old_ads: "AFTER_PUBLISH" # one of: AFTER_PUBLISH, BEFORE_PUBLISH, NEVER
  delete_old_ads_by_title: true # only works if delete_old_ads is set to BEFORE_PUBLISH
```
### captcha
Captcha handling configuration. Enable automatic restart to avoid manual confirmation after captchas.
```yaml
captcha:
  auto_restart: true # If true, the bot aborts when a Captcha appears and retries publishing later
                     # If false (default), the Captcha must be solved manually to continue
  restart_delay: 1h 30m # Time to wait before retrying after a Captcha was encountered (default: 6h)
```
### browser
Browser configuration. These settings control how the bot launches and connects to Chromium-based browsers.
```yaml
browser:
  # See: https://peter.sh/experiments/chromium-command-line-switches/
  arguments:
    # Example arguments
    - --disable-dev-shm-usage
    - --no-sandbox
    # --headless
    # --start-maximized
  binary_location: # path to custom browser executable, if not specified will be looked up on PATH
  extensions: [] # a list of .crx extension files to be loaded
  use_private_window: true
  user_data_dir: "" # see https://github.com/chromium/chromium/blob/main/docs/user_data_dir.md
  profile_name: ""
```
**Common browser arguments:**
- `--disable-dev-shm-usage` - Avoids shared memory issues in Docker environments
- `--no-sandbox` - Required when running as root (not recommended)
- `--headless` - Run browser in headless mode (no GUI)
- `--start-maximized` - Start browser maximized
For detailed browser connection troubleshooting, including Chrome 136+ security requirements and remote debugging setup, see [Browser Troubleshooting](./BROWSER_TROUBLESHOOTING.md).
### update_check
Update check configuration to automatically check for newer versions on GitHub.
```yaml
update_check:
  enabled: true # Enable/disable update checks
  channel: latest # One of: latest, preview
  interval: 7d # Check interval (e.g. 7d for 7 days)
```
**Interval format:**
- `s`: seconds, `m`: minutes, `h`: hours, `d`: days
- Examples of the syntax: `7d` (7 days), `12h` (12 hours), `30d` (30 days)
- Validation: minimum 1 day (`1d`), maximum 30 days (`30d`); shorter values such as `12h` parse but fall below the minimum
**Channels:**
- `latest`: Only final releases
- `preview`: Includes pre-releases
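The interval format and its validation range can be sketched in a few lines of Python. This is an illustrative parser under the rules stated above, not the bot's own code; the names `parse_interval` and `is_allowed_interval` are hypothetical, and it accepts only a single number followed by one unit.

```python
import re
from datetime import timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}
MIN_INTERVAL = timedelta(days=1)
MAX_INTERVAL = timedelta(days=30)


def parse_interval(text: str) -> timedelta:
    """Parse strings such as '7d' or '12h' into a timedelta."""
    match = re.fullmatch(r"\s*(\d+)\s*([smhd])\s*", text)
    if not match:
        raise ValueError(f"unsupported interval: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


def is_allowed_interval(text: str) -> bool:
    """True if the interval parses and lies between 1 and 30 days inclusive."""
    return MIN_INTERVAL <= parse_interval(text) <= MAX_INTERVAL
```

Here `7d` and `30d` pass both steps, `12h` parses but is rejected by the range check, and an unknown unit such as `7w` raises `ValueError`.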
### login
Login credentials.
```yaml
login:
  username: ""
  password: ""
```
> **Security Note:** Never commit your credentials to version control. Keep your `config.yaml` secure and exclude it from git if it contains sensitive information.
### diagnostics
Diagnostics configuration for troubleshooting login detection issues.
```yaml
diagnostics:
  login_detection_capture: false # Capture screenshot + HTML when login state is UNKNOWN
  pause_on_login_detection_failure: false # Pause for manual inspection (interactive only)
  output_dir: "" # Custom output directory (default: portable .temp/diagnostics, xdg cache/diagnostics)
```
**Login Detection Behavior:**
The bot uses a layered approach to detect login state, prioritizing stealth over reliability:
1. **DOM check (primary method - preferred for stealth)**: Checks for user profile elements
   - Looks for `.mr-medium` element containing username
   - Falls back to `#user-email` ID
   - Uses `login_detection` timeout (default: 10.0 seconds)
   - Minimizes bot-like behavior by avoiding JSON API requests
2. **Auth probe fallback (more reliable)**: Sends a GET request to `{root_url}/m-meine-anzeigen-verwalten.json?sort=DEFAULT`
   - Returns `LOGGED_IN` if response is HTTP 200 with valid JSON containing `"ads"` key
   - Returns `LOGGED_OUT` if response is HTTP 401/403 or HTML contains login markers
   - Returns `UNKNOWN` on timeouts, assertion failures, or unexpected response bodies
   - Only used when DOM check is inconclusive (UNKNOWN or timed out)
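The decision logic of the two layers can be sketched as follows. This is a simplified illustration rather than the bot's code: the function names are hypothetical, the `LOGIN_MARKERS` tuple is a placeholder because the real marker strings are not listed here, and the HTTP request itself is left out so that only the classification rules are shown.

```python
import json
from enum import Enum


class LoginState(Enum):
    LOGGED_IN = "LOGGED_IN"
    LOGGED_OUT = "LOGGED_OUT"
    UNKNOWN = "UNKNOWN"


# Hypothetical markers; the strings the bot actually looks for are not documented here.
LOGIN_MARKERS = ("login-form",)


def classify_auth_probe(status: int, body: str) -> LoginState:
    """Map an auth-probe HTTP response to a login state."""
    if status in (401, 403):
        return LoginState.LOGGED_OUT
    if status == 200:
        try:
            data = json.loads(body)
        except ValueError:  # body is not valid JSON, e.g. an HTML page
            data = None
        if isinstance(data, dict) and "ads" in data:
            return LoginState.LOGGED_IN
    if any(marker in body for marker in LOGIN_MARKERS):
        return LoginState.LOGGED_OUT
    return LoginState.UNKNOWN


def detect_login(dom_state: LoginState, status: int, body: str) -> LoginState:
    """Use the DOM result when conclusive, otherwise fall back to the probe."""
    if dom_state is not LoginState.UNKNOWN:
        return dom_state
    return classify_auth_probe(status, body)
```

The probe result is consulted only when the DOM check returns `UNKNOWN`, which mirrors the stealth-first ordering described above.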
**Optional diagnostics:**
- Enable `login_detection_capture` to capture screenshots and HTML dumps when state is `UNKNOWN`
- Enable `pause_on_login_detection_failure` to pause the bot for manual inspection (interactive sessions only; requires `login_detection_capture=true`)
- Use custom `output_dir` to specify where artifacts are saved
**Output locations (default):**
- **Portable mode**: `./.temp/diagnostics/`
- **System-wide mode (XDG)**: `~/.cache/kleinanzeigen-bot/diagnostics/` (Linux) or `~/Library/Caches/kleinanzeigen-bot/diagnostics/` (macOS)
- **Custom**: Path resolved relative to your `config.yaml` if `output_dir` is specified
> **⚠️ PII Warning:** HTML dumps may contain your account email or other personally identifiable information. Review files in the diagnostics output directory before sharing them publicly.
## Installation Modes
On first run, the app may ask which installation mode to use.
1. **Portable mode (recommended for most users, especially on Windows):**
   - Stores config, logs, downloads, and state in the current directory
   - No admin permissions required
   - Easy backup/migration; works from USB drives
2. **System-wide mode (advanced users / multi-user setups):**
   - Stores files in OS-standard locations
   - Cleaner directory structure; better separation from working directory
   - Requires proper permissions for user data directories
**OS notes:**
- **Windows:** System-wide uses AppData (Roaming/Local); portable keeps everything beside the `.exe`.
- **Linux:** System-wide follows XDG Base Directory spec; portable stays in the current working directory.
- **macOS:** System-wide uses `~/Library/Application Support/kleinanzeigen-bot` (and related dirs); portable stays in the current directory.
## Getting Current Defaults
To see all current default values, run:
```bash
kleinanzeigen-bot create-config
```
This generates a config file with `exclude_none=True`, giving you all the non-None defaults.
For the complete machine-readable reference, see the [JSON schema](../schemas/config.schema.json).

docs/INDEX.md Normal file

@@ -0,0 +1,33 @@
# Documentation Index
This directory contains detailed documentation for kleinanzeigen-bot users and contributors.
## User Documentation
- [Configuration](./CONFIGURATION.md) - Complete reference for `config.yaml`, including all configuration options, timeouts, browser settings, and update check configuration.
- [Ad Configuration](./AD_CONFIGURATION.md) - Complete reference for ad YAML files, including automatic price reduction, description prefix/suffix, and shipping options.
- [Browser Troubleshooting](./BROWSER_TROUBLESHOOTING.md) - Troubleshooting guide for browser connection issues, including Chrome 136+ security requirements, remote debugging setup, and common solutions.
## Contributor Documentation
Contributor documentation is located in the main repository:
- [CONTRIBUTING.md](../CONTRIBUTING.md) - Development setup, workflow, code quality standards, testing requirements, and contribution guidelines.
- [TESTING.md](./TESTING.md) - Detailed testing strategy, test types (unit/integration/smoke), and execution instructions for contributors.
## Getting Started
New users should start with the [README](../README.md), then refer to these documents for detailed configuration and troubleshooting information.
### Quick Start (3 steps)
1. Install and run the app from the [README](../README.md).
2. Generate `config.yaml` with `kleinanzeigen-bot create-config` and review defaults in [Configuration](./CONFIGURATION.md).
3. Verify your setup with `kleinanzeigen-bot verify`, then publish with `kleinanzeigen-bot publish`.
### Common Troubleshooting Tips
- Browser connection issues: confirm remote debugging settings and Chrome 136+ requirements in [Browser Troubleshooting](./BROWSER_TROUBLESHOOTING.md).


@@ -39,14 +39,15 @@ This project uses a layered testing approach, with a focus on reliability and fa
- All smoke tests **must** be marked with `@pytest.mark.smoke`.
- Place smoke tests in `tests/smoke/` for discoverability.
- Example:
  ```python
  import pytest
  @pytest.mark.smoke
  @pytest.mark.asyncio
  async def test_bot_starts(smoke_bot):
      ...
  ```
### Running Smoke, Unit, and Integration Tests


@@ -1,95 +0,0 @@
# Update Check Feature
## Overview
The update check feature automatically checks for newer versions of the bot on GitHub. It supports two channels:
- `latest`: Only final releases
- `prerelease`: Includes pre-releases
## Configuration
```yaml
update_check:
  enabled: true # Enable/disable update checks
  channel: latest # One of: latest, prerelease
  interval: 7d # Check interval (e.g. 7d for 7 days)
```
### Interval Format
The interval is specified as a number followed by a unit:
- `s`: seconds
- `m`: minutes
- `h`: hours
- `d`: days
Examples:
- `7d`: Check every 7 days
- `12h`: Check every 12 hours
- `30d`: Check every 30 days
Validation rules:
- Minimum interval: 1 day (`1d`)
- Maximum interval: 30 days (`30d`, roughly 4 weeks)
- Value must be positive
- Only supported units are allowed
## State File
The update check state is stored in `.temp/update_check_state.json`. The file format is:
```json
{
"version": 1,
"last_check": "2024-03-20T12:00:00+00:00"
}
```
### Fields
- `version`: Current state file format version (integer)
- `last_check`: ISO 8601 timestamp of the last check (UTC)
### Migration
The state file supports version migration:
- Version 0 to 1: Added version field
- Future versions will be migrated automatically
### Timezone Handling
All timestamps are stored in UTC:
- When loading:
  - Timestamps without timezone are assumed to be UTC
  - Timestamps with timezone are converted to UTC
- When saving:
  - All timestamps are converted to UTC before saving
  - Timezone information is preserved in ISO 8601 format
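The loading rules above can be sketched with the standard library. This is a minimal illustration under the stated rules, not the feature's actual code, and the helper name `to_utc` is hypothetical.

```python
from datetime import datetime, timezone


def to_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp and normalize it to UTC."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Naive timestamps are assumed to already be in UTC
        return parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

Calling `.isoformat()` on the result produces a string with an explicit `+00:00` offset, the same shape as the `last_check` value in the state file example.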
### Edge Cases
The following edge cases are handled:
- Missing state file: Creates new state file
- Corrupted state file: Creates new state file
- Invalid timestamp format: Logs warning, uses current time
- Permission errors: Logs warning, continues without saving
- Invalid interval format: Logs warning, performs check
- Interval too short/long: Logs warning, performs check
## Error Handling
The update check feature handles various error scenarios:
- Network errors: Logs error, continues without check
- GitHub API errors: Logs error, continues without check
- Version parsing errors: Logs error, continues without check
- State file errors: Logs error, creates new state file
- Permission errors: Logs error, continues without saving
## Logging
The feature logs various events:
- Check results (new version available, up to date, etc.)
- State file operations (load, save, migration)
- Error conditions (network, API, parsing, etc.)
- Interval validation warnings
- Timezone conversion information