# log_ingest
A Rust CLI tool for loading log files into a SQLite database for analysis.
## Overview
Parses application logs containing `signature:` messages and loads them into SQLite for querying. Designed to handle large log volumes (10 GB+ per day) through batched inserts and efficient parsing.
## Features
- Parse `signature:` messages extracting app info, device details, and feature flags
- Support for both plain `.log` and gzip compressed `.log.gz` files
- File discovery by date range using a `YYYY/MM/DD` directory structure
- Batched inserts for performance with large files
- Indexed columns (`session_id`, `version`) for efficient queries
- Extensible parser architecture for adding new message types
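
As an illustration of the plain versus gzip distinction listed above, here is a minimal sketch that classifies a file by its extension alone. The `LogFormat` enum and `detect_format` function are hypothetical names, not taken from this crate, and the real tool may decide differently (for example by inspecting file contents).

```rust
use std::path::Path;

/// Hypothetical classification of an input file.
#[derive(Debug, PartialEq, Eq)]
enum LogFormat {
    Plain,
    Gzip,
}

/// Classify a log file by extension alone; file contents are not inspected.
fn detect_format(path: &Path) -> LogFormat {
    match path.extension().and_then(|ext| ext.to_str()) {
        Some(ext) if ext.eq_ignore_ascii_case("gz") => LogFormat::Gzip,
        _ => LogFormat::Plain,
    }
}

fn main() {
    for name in ["app.log", "app.log.gz"] {
        println!("{name}: {:?}", detect_format(Path::new(name)));
    }
}
```

A `Gzip` result would then route the file through a decompressing reader before parsing; that step needs an external crate and is left out of this sketch.
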
## Installation
```bash
cargo build --release
```
## Usage
### Process a single file
```bash
log_ingest --file /path/to/logs.log --output output.db
```
### Process a date range
```bash
log_ingest \
  --from 2026/01/20 \
  --to 2026/01/21 \
  --base-dir /var/log/myapp \
  --filename app.log \
  --output output.db
```
For each day in the range, the tool looks for `<base-dir>/YYYY/MM/DD/<filename>.gz` or `<base-dir>/YYYY/MM/DD/<filename>`.
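
The lookup described above can be sketched with the standard library only. This is an illustration, not the tool's actual code: the `Date` type and `candidate_paths` function are hypothetical names, and listing the `.gz` variant first is an assumption about lookup order.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical calendar date; field order makes the derived ordering chronological.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Date {
    year: u32,
    month: u32,
    day: u32,
}

fn is_leap(year: u32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

fn days_in_month(year: u32, month: u32) -> u32 {
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        _ => if is_leap(year) { 29 } else { 28 },
    }
}

impl Date {
    /// Parse a `YYYY/MM/DD` string such as `2026/01/20`.
    fn parse(s: &str) -> Option<Date> {
        let mut parts = s.split('/');
        let year: u32 = parts.next()?.parse().ok()?;
        let month: u32 = parts.next()?.parse().ok()?;
        let day: u32 = parts.next()?.parse().ok()?;
        if parts.next().is_some() || !(1..=12).contains(&month) {
            return None;
        }
        if day == 0 || day > days_in_month(year, month) {
            return None;
        }
        Some(Date { year, month, day })
    }

    /// The following calendar day, rolling over months and years.
    fn next(self) -> Date {
        if self.day < days_in_month(self.year, self.month) {
            Date { day: self.day + 1, ..self }
        } else if self.month < 12 {
            Date { month: self.month + 1, day: 1, ..self }
        } else {
            Date { year: self.year + 1, month: 1, day: 1 }
        }
    }
}

/// Candidate paths for every day in `from..=to`, `.gz` variant first (assumed order).
fn candidate_paths(base: &Path, filename: &str, from: Date, to: Date) -> Vec<PathBuf> {
    let mut out = Vec::new();
    let mut d = from;
    while d <= to {
        let dir = base
            .join(format!("{:04}", d.year))
            .join(format!("{:02}", d.month))
            .join(format!("{:02}", d.day));
        out.push(dir.join(format!("{filename}.gz")));
        out.push(dir.join(filename));
        d = d.next();
    }
    out
}

fn main() {
    let from = Date::parse("2026/01/20").expect("valid date");
    let to = Date::parse("2026/01/21").expect("valid date");
    for p in candidate_paths(Path::new("/var/log/myapp"), "app.log", from, to) {
        println!("{}", p.display());
    }
}
```

A caller would typically keep the first candidate per day that exists on disk and skip days with no file.
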
### Options
| Option | Description |
|--------|-------------|
| `--file <PATH>` | Single log file to process |
| `--from <DATE>` | Start date (`YYYY/MM/DD`) |
| `--to <DATE>` | End date (`YYYY/MM/DD`) |
| `--base-dir <PATH>` | Base directory containing log files |
| `--filename <NAME>` | Log filename (e.g., `app.log`) |
| `-o, --output <PATH>` | Output SQLite database path |
| `--batch-size <N>` | Batch size for inserts (default: 10000) |
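
To show what `--batch-size` controls, here is a minimal sketch of row batching using only the standard library. The `Batcher` type is a hypothetical name and not the tool's implementation. In a real ingest path the flush callback would presumably write each group of rows to SQLite, for instance inside a single transaction; here it only records batch sizes.

```rust
/// Collects rows and hands them to `flush` in groups of at most `batch_size`.
struct Batcher<T, F: FnMut(&[T])> {
    buf: Vec<T>,
    batch_size: usize,
    flush: F,
}

impl<T, F: FnMut(&[T])> Batcher<T, F> {
    fn new(batch_size: usize, flush: F) -> Self {
        assert!(batch_size > 0, "batch size must be positive");
        Batcher { buf: Vec::with_capacity(batch_size), batch_size, flush }
    }

    /// Buffer one row; flush automatically once the batch is full.
    fn push(&mut self, row: T) {
        self.buf.push(row);
        if self.buf.len() >= self.batch_size {
            self.flush_now();
        }
    }

    /// Flush whatever is buffered, if anything.
    fn flush_now(&mut self) {
        if !self.buf.is_empty() {
            (self.flush)(&self.buf);
            self.buf.clear();
        }
    }
}

fn main() {
    let mut sizes = Vec::new();
    {
        let mut batcher = Batcher::new(3, |rows: &[u32]| sizes.push(rows.len()));
        for i in 0..7 {
            batcher.push(i);
        }
        batcher.flush_now(); // write the final partial batch
    }
    println!("{sizes:?}"); // prints [3, 3, 1]
}
```

The explicit final `flush_now` call matters: without it, rows left over after the last full batch would never be written.
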
## Database Schema
```sql
CREATE TABLE signature_entries (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL,
    timestamp TEXT NOT NULL,
    app TEXT NOT NULL,
    version TEXT NOT NULL,
    offline_login_usage INTEGER NOT NULL,
    is_password_autofill_enabled INTEGER NOT NULL,
    camera_roll_usage INTEGER NOT NULL,
    os TEXT NOT NULL,
    app_name TEXT NOT NULL,
    touch_id INTEGER NOT NULL,
    is_offline_login_enabled INTEGER NOT NULL,
    model TEXT NOT NULL,
    device TEXT NOT NULL,
    password_autofill_usage INTEGER NOT NULL
);

CREATE INDEX idx_session_id ON signature_entries(session_id);
CREATE INDEX idx_version ON signature_entries(version);
```
## Example Queries
```sql
-- Percentage of signature entries with password autofill enabled
SELECT
    ROUND(100.0 * SUM(is_password_autofill_enabled) / COUNT(*), 2) AS pct
FROM signature_entries;

-- Count by app version
SELECT version, COUNT(*) AS cnt
FROM signature_entries
GROUP BY version
ORDER BY cnt DESC;

-- Device breakdown
SELECT device, COUNT(*) AS cnt
FROM signature_entries
GROUP BY device;
```
## Development
```bash
# Build
cargo build

# Run tests
cargo test

# Format
cargo fmt

# Lint
cargo clippy
```
## License
MIT