Commit message (Author, Age)
* Merge remote-tracking branch 'origin/no-crc32' [HEAD, webstat-0.4, main] (Dan Goodliffe, 4 days)
|\
| * Swap int for integer in schema (Dan Goodliffe, 4 days)
| |   Plays better with apgdiff
| * Save point only if there are new entities (Dan Goodliffe, 6 days)
| |   Line insert is only a single operation with no new entities.
| * 4 fields is more than enough for Entity to be a fully-fledged type (Dan Goodliffe, 6 days)
| |
| * Replace use of crc32 for entity id (Dan Goodliffe, 7 days)
| |   Entity value is MD5 hashed, the same as the DB unique key, but the id itself is now taken from the DB primary key, which is sequence generated.
| * Introduce MD5 from libmd, use it for hashing queuedLines for park path (Dan Goodliffe, 10 days)
| |
| * Return path of parked lines log file from parkQueuedLogLines (Dan Goodliffe, 11 days)
| |   Or the last errno on failure.
|/
* Parse escaping in query strings [webstat-0.3.1] (Dan Goodliffe, 2026-03-27)
|
* Revise stats and add signal handlers to log them and reset them [webstat-0.3] (Dan Goodliffe, 2026-03-25)
|   Also logs them on main loop exit.
* Employ temporary/short files to handle errors reading/writing park logs (Dan Goodliffe, 2026-03-22)
|
* Add missing -Wshadow (Dan Goodliffe, 2026-03-22)
|
* Add logging :-o (Dan Goodliffe, 2026-03-20)
|   Adds a virtual log function: the real implementation writes to syslog, the test implementation writes to BOOST_TEST_MESSAGE, and the perf implementation discards. Replaces existing prints to stderr and adds logs to all key points.
* Insert log entries in batches (Dan Goodliffe, 2026-03-20)
|   Store log lines in memory until a threshold is reached or idle occurs, then insert all the lines in a single transaction. Save points handle the case of insertion errors. On success the queue is cleared. Parked lines are also saved in bulk, which is only necessary if queued lines could not be inserted on shutdown; otherwise the queue simply grows until the ability to insert is restored. Importing parked lines just adds them to the queue and the normal process then follows.
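The queueing behaviour described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's code: `LineBatcher`, its members and the callback signature are hypothetical names, and the callback merely stands in for "insert every queued line in one transaction" (the save point handling on the database side is not modelled).

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical batcher; names and the callback signature are illustrative only.
class LineBatcher {
public:
    using InsertFn = std::function<bool(const std::vector<std::string> &)>;

    LineBatcher(std::size_t threshold, InsertFn insert) :
        threshold_ {threshold}, insert_ {std::move(insert)}
    {
    }

    // Queue a line and attempt a flush once the threshold is reached.
    void add(std::string line)
    {
        queue_.push_back(std::move(line));
        if (queue_.size() >= threshold_) {
            flush();
        }
    }

    // Intended to be called at the threshold and when idle.
    // The queue is cleared only if the insert callback reports success.
    bool flush()
    {
        if (queue_.empty()) {
            return true;
        }
        if (!insert_(queue_)) {
            return false; // keep the lines; the queue grows until inserts work again
        }
        queue_.clear();
        return true;
    }

    std::size_t queued() const
    {
        return queue_.size();
    }

private:
    std::size_t threshold_;
    InsertFn insert_;
    std::vector<std::string> queue_;
};
```

A caller would construct the batcher with a threshold and an insert callback, call `add` for each incoming line, and call `flush` from its idle path. Parking would then only be needed for whatever is still queued at shutdown.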
* Gracefully handle SIGTERM (Dan Goodliffe, 2026-03-19)
|   Apache sends SIGTERM to the logger process to shut it down. Honestly I thought it would just close stdin and I should have checked.
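One common way to shut down gracefully on SIGTERM is a handler that only sets a flag which the main loop checks. The sketch below assumes that shape; `stopRequested`, `onSigTerm` and `runUntilTerminated` are hypothetical names, and `processOne` stands in for handling one unit of input.

```cpp
#include <csignal>

// Set from the signal handler; volatile sig_atomic_t is safe to write there.
volatile std::sig_atomic_t stopRequested = 0;

void onSigTerm(int)
{
    stopRequested = 1;
}

// Hypothetical main-loop shape: run processOne() until SIGTERM has been seen.
template<typename ProcessOne>
int runUntilTerminated(ProcessOne processOne)
{
    std::signal(SIGTERM, onSigTerm);
    int iterations = 0;
    while (!stopRequested) {
        processOne();
        ++iterations;
    }
    return iterations; // the caller can now flush queues and exit cleanly
}
```

The handler does nothing beyond setting the flag, so all real shutdown work (inserting or parking queued lines) happens on the normal code path after the loop exits.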
* Count and return the number of parked lines ingested (Dan Goodliffe, 2026-03-18)
|
* Replace unique constraint on entity value with index on hash (Dan Goodliffe, 2026-03-18)
|   UNIQUE CONSTRAINT is limited to 2704 bytes, which prevents inserting large values. Here we swap to a unique index on the MD5 hash of the value. This should more than suffice given we already map to a 32-bit value for the id and the index size is much much smaller.
* Fix typo in access_log_view definition (Dan Goodliffe, 2026-03-18)
|   Replaces accidentally duplicated user_agent for correct content_type.
* Use std::future over std::thread for background jobs (Dan Goodliffe, 2026-03-17)
|   Easier checking if a job has completed [successfully] and resetting state for the next time.
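The benefit mentioned here can be illustrated with a small sketch. `JobSlot` is a hypothetical wrapper, not a type from this repository: it polls a `std::future<void>` without blocking, reports whether the job finished without throwing, and is left ready for the next run.

```cpp
#include <chrono>
#include <future>
#include <optional>
#include <utility>

// Hypothetical job slot; the project's real job types are not shown in this log.
class JobSlot {
public:
    // Start the job on a background thread unless one is still held in the slot.
    template<typename Fn>
    bool start(Fn fn)
    {
        if (running_.valid()) {
            return false;
        }
        running_ = std::async(std::launch::async, std::move(fn));
        return true;
    }

    // Non-blocking poll: empty while idle or still running, otherwise
    // true if the job completed normally and false if it threw.
    std::optional<bool> poll()
    {
        using namespace std::chrono_literals;
        if (!running_.valid() || running_.wait_for(0s) != std::future_status::ready) {
            return std::nullopt;
        }
        try {
            running_.get(); // rethrows any exception from the job
            return true;
        }
        catch (...) {
            return false;
        }
    }

private:
    std::future<void> running_;
};
```

Calling `get()` leaves the future invalid again, so the slot needs no separate reset step before the next `start`. With a bare `std::thread`, completion and failure would need extra shared state.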
* Don't start new curl operations outside the main thread (Dan Goodliffe, 2026-03-17)
|   Neither the curl handle nor the operation map is thread safe. This isn't ideal, but it does solve the problem in a safe manner.
* Execute jobs even when processing incoming logs (Dan Goodliffe, 2026-03-17)
|   Jobs run on background threads now, so we can happily run them even when we're busy.
* Run jobs on a background thread (Dan Goodliffe, 2026-03-17)
|
* Process new field, content-type, in input stream (Dan Goodliffe, 2026-01-18)
|
* Attempt to save uninsertable log lines to the entities table (Dan Goodliffe, 2026-01-17)
|   If that fails, we still park them as before, such as when the DB is unavailable. Those which are saved as entities require investigation into why they couldn't be saved, much like UnparsableLines.
* pg_format schema.sql and sql/*.sql (Dan Goodliffe, 2026-01-17)
|   No changes.
* Add job for purging old access log entries from the database (Dan Goodliffe, 2025-12-20)
|
* Add a few no lint comments (Dan Goodliffe, 2025-12-20)
|
* Add support for configuring frequency of parked line job (Dan Goodliffe, 2025-12-20)
|
* Add utility for parsing an ISO like duration (Dan Goodliffe, 2025-12-20)
|
* Rename test utilities to avoid name conflict. (Dan Goodliffe, 2025-12-20)
|
* Replace that awful magic number heavy mapping function (Dan Goodliffe, 2025-10-16)
|   Now a tuple of mapping functors and we pass each value through its corresponding converter.
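The "tuple of mapping functors" idea can be sketched as below. The helper names, the three example fields and their converters are all hypothetical; the commit does not show the real field list. Each raw field is passed through the converter at the same index, and the results come back as a typed tuple.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <string_view>
#include <tuple>
#include <utility>

// Apply the I-th converter to the I-th raw field, for every index I.
template<typename Converters, std::size_t N, std::size_t... I>
auto convertFieldsImpl(const Converters & converters,
        const std::array<std::string_view, N> & raw, std::index_sequence<I...>)
{
    return std::make_tuple(std::get<I>(converters)(raw[I])...);
}

// One converter per field; the array size must match the tuple size.
template<typename... Fn>
auto convertFields(const std::tuple<Fn...> & converters,
        const std::array<std::string_view, sizeof...(Fn)> & raw)
{
    return convertFieldsImpl(converters, raw, std::index_sequence_for<Fn...> {});
}

// Hypothetical converters for a three-field line: host, status, bytes.
inline const auto exampleConverters = std::make_tuple(
        [](std::string_view v) { return std::string {v}; },
        [](std::string_view v) { return std::stoi(std::string {v}); },
        [](std::string_view v) { return v == "-" ? 0UL : std::stoul(std::string {v}); });
```

Because the converter is selected by position at compile time, no index constants appear in the mapping itself, and a mismatch between the number of fields and the number of converters fails to compile.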
* Refactor handling of new entity insert (Dan Goodliffe, 2025-10-15)
|   Replaces weird select with one thing with a function pointer stored in the type definition array.
* Update comments on custom_log format (Dan Goodliffe, 2025-10-15)
|
* Add access_log_view (Dan Goodliffe, 2025-10-15)
|   A view pre-joined with entities, showing all the original data along with ids; ideal for human readable stuff.
* Always handle curl things if there are any [webstat-0.2.2] (Dan Goodliffe, 2025-10-10)
|
* Fix premature remembering of saved entity ids [webstat-0.2.1] (Dan Goodliffe, 2025-10-09)
|   Don't persist entity ids saved to the DB until the transaction is committed. Prevents the issue where a later DB operation fails, the transaction is rolled back, but we still think the entity has been saved.
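A minimal sketch of the fix described here, assuming an in-memory set of known ids: ids inserted during an open transaction are held in a pending set and only promoted on commit. `SavedEntityCache` and its member names are hypothetical.

```cpp
#include <cstdint>
#include <unordered_set>

// Hypothetical cache of entity ids known to exist in the DB.
class SavedEntityCache {
public:
    using EntityId = std::uint32_t;

    // True only for ids whose transaction has been committed.
    bool isSaved(EntityId id) const
    {
        return committed_.count(id) != 0;
    }

    // Record an id inserted inside the current, still-open transaction.
    void markPending(EntityId id)
    {
        pending_.insert(id);
    }

    // Transaction committed: the pending ids are now safe to remember.
    void onCommit()
    {
        committed_.insert(pending_.begin(), pending_.end());
        pending_.clear();
    }

    // Transaction rolled back: forget the pending ids so they are inserted again later.
    void onRollback()
    {
        pending_.clear();
    }

private:
    std::unordered_set<EntityId> committed_;
    std::unordered_set<EntityId> pending_;
};
```

With this split, a rollback can never leave the process believing an entity exists when the database has discarded it.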
* Fix up QuotedString/CLFString parsing (Dan Goodliffe, 2025-10-09)
|   Refactors CLFString in terms of QuotedString, but with the option of being null (nullopt). Moves the whole decode function into QuotedString's parser, fixing support for escaping of " which would otherwise prematurely end the string in the middle.
* Add PROPFIND to http_verb list (Dan Goodliffe, 2025-10-09)
|
* Add parked line import job [webstat-0.2] (Dan Goodliffe, 2025-10-06)
|   Periodically, on idle, scan for and import previously parked lines.
* Add point to execute scheduled jobs when idle (Dan Goodliffe, 2025-10-02)
|
* Switch to PostgreSQL's oid type for entity ids (Dan Goodliffe, 2025-09-30)
|   oid is an "unsigned 4 byte integer", which matches our crc32 approach perfectly, and is half the storage cost of bigint.
* Write log lines to files on error (Dan Goodliffe, 2025-09-30)
|   We call this parking; later we can reattempt ingestion after whatever caused the failure has been fixed.
* Create settings structure (Dan Goodliffe, 2025-09-24)
|   Holds all the settings and their defaults for use in program_options and tests. Disables missing-field-initializers in tests because it's oversensitive to structures with defaults where you only provide some values specifically.
* Write unparsable lines to the entity table (Dan Goodliffe, 2025-09-23)
|   Diagnostics and the ability to ingest later.
* Make DB pool protected for access from unit tests (Dan Goodliffe, 2025-09-23)
|
* Gracefully handle fatal exceptions and display message (Dan Goodliffe, 2025-09-19)
|
* Create and perform UA lookup curl op when new user agent is encountered (Dan Goodliffe, 2025-09-13)
|
* Perform background curl operations when not processing log input (Dan Goodliffe, 2025-09-13)
|
* Always pass API URL to curlGetUserAgentDetail (Dan Goodliffe, 2025-09-13)
|
* Simplify storeEntities with bindMany (Dan Goodliffe, 2025-09-10)
|
* Use curl_multi_poll in main ingestLog loop (Dan Goodliffe, 2025-09-10)
|   Preparation step for having background curl operations.