Introduction
If you read the maintenance article I wrote some months ago, you already know why regular maintenance matters and how to automate basic housekeeping. But there is an uncomfortable truth: when your database has grown large, especially the action_logs table, a simple purge can turn into a long-running, locking, and timeout-prone operation that could make your instance unusable until it finishes.
This article is a continuation of that one, but it focuses on doing the risky part properly: cleaning action logs in a controlled, granular way and, optionally, reclaiming disk space at the database level.
I already use your maintenance script, am I affected?
If you are already using the maintenance script on a regular basis, you are probably not affected. The main reason for this follow-up article is feedback from people who did not expect their server to become unusable when they ran a low retention on a server with 1M+ action logs. This article is here to help administrators who have never done this maintenance and who have been using passbolt for a few years now.
But mate, why do action logs become a problem?
On all instances (and obviously, especially on busy ones), the action_logs table can quickly accumulate hundreds of thousands to millions of rows. We log everything, from a simple authentication to copying a secret and even logging out: everything is tracked for security purposes. This typically leads to:
- Slower queries and degraded responsiveness, especially on the first login that displays all the resources.
- More disk pressure because even if you delete rows, InnoDB won’t necessarily return space to the OS immediately.
- Potential operational risks, since large deletion operations can lock, stall or time out, resulting in long downtime for your production environment.
The important nuance is that deleting rows and reclaiming disk space are two different operations. Deleting makes the data go away logically, whereas reclaiming space requires a table rebuild. The golden rule from all the tickets we have received so far is: plan a maintenance window. It’s better to assume downtime if you’re about to delete a large volume of logs, e.g. 1M+.
During a heavy purge, you can expect your server to become unresponsive or extremely slow until operations complete. How long this lasts depends on hardware (IOPS, CPU), database settings, and table sizes, and it can be anywhere from minutes to hours.
Before proceeding, you must have:
- A fresh snapshot of the virtual machine or a verified full backup.
- A clearly communicated maintenance window because users may be impacted.
- If you can, a rehearsal on a pre-production environment.
Is there a good purging strategy?
If a dry-run (or, if you are used to playing with magic, a rough estimate based on table size) suggests a very large deletion, please don’t purge everything in one run. Large operations can fail for "boring" reasons such as lock contention, max execution time in wrappers, cron environments, systemd limits, etc.
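If you want a rough number before touching anything, a read-only count per threshold is one way to get it. The sketch below only prints the SQL. It assumes the action_logs table has a `created` datetime column (check your schema first), and the helper name is made up for illustration. On a very large table the count itself can take a while, so run it outside peak hours.

```shell
# Sketch: print a read-only SQL query that counts action_logs rows older
# than a given number of days.
# Assumption: action_logs has a `created` datetime column.

count_older_than_sql() {
  local days="$1"
  # Refuse anything that is not a plain whole number of days.
  case "$days" in
    ''|*[!0-9]*) echo "days must be a whole number" >&2; return 1 ;;
  esac
  printf 'SELECT COUNT(*) AS rows_older_than_%s_days FROM action_logs WHERE created < NOW() - INTERVAL %s DAY;\n' "$days" "$days"
}

# Usage (user and database name are placeholders for your own):
# count_older_than_sql 730 | mysql -u passbolt -p passbolt
```

Running it for each threshold you plan to use (730, 455, 365 and so on) gives an idea of how much each pass will have to delete.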
Instead, I would invite you to go granularly. The idea is to run multiple purges with decreasing retention thresholds, while raising how much each run is allowed to delete (the --limit value). This reduces the chance of a single giant transaction and lets you stop safely if the window is closing. Starting with the oldest logs means you attack the biggest accumulation immediately and avoid a massive “catch-up” purge later.
To start the maintenance, follow my previous article up to and including the chapter “Updating script permissions”. After that, we’ll go step by step:
sudo ./passbolt_maintenance.sh -r 730 --limit 1000000
sudo ./passbolt_maintenance.sh -r 455 --limit 1000000
sudo ./passbolt_maintenance.sh -r 365 --limit 1000000
sudo ./passbolt_maintenance.sh -r 180 --limit 1000000
sudo ./passbolt_maintenance.sh -r 90 --limit 1000000
We’re running the purge in several passes on purpose. When your action_logs table is huge, asking the database to delete “everything older than X” in a single run can become a very large transaction, with long lock times, heavy IO, and a higher chance of timeouts or of the whole instance feeling frozen until it completes. In the example, I used 2 years (-r 730) because it’s a common “worst offender” range in real-world tickets: servers that have been running for a few years without maintenance usually have a lot of noise in the 1 to 2 year bucket.
That being said, if you have been using passbolt for longer than that, e.g. 4 years, first of all thank you, we really appreciate it. In that case you can start with a higher retention, e.g. -r 1460 for 4 years, and then work your way down. Conversely, if your instance is less than a year old, starting at a retention of 730 doesn’t add value: start closer to how long the instance has been running.
TL;DR: start with “How long has the instance existed without purging?”, then step down progressively until you reach your target retention.
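The passes above can also be chained so that the sequence stops at the first failure instead of blindly moving on to the next threshold. This is only a sketch: it assumes the maintenance script exits with a non-zero status when a purge fails, which you should verify before relying on it. The function and variable names are hypothetical; only -r and --limit come from the commands above.

```shell
# Sketch: run the purge passes one at a time, oldest threshold first,
# and stop as soon as one pass fails.
# Assumption: the maintenance script exits non-zero on failure.

# Command to run; overridable so the loop can be rehearsed with a stub.
PURGE_CMD="${PURGE_CMD:-sudo ./passbolt_maintenance.sh}"
# Same limit as in the commands above.
PURGE_LIMIT="${PURGE_LIMIT:-1000000}"

run_purge_passes() {
  local days
  for days in "$@"; do
    echo "== Purging action logs older than ${days} days =="
    # PURGE_CMD is intentionally unquoted so it splits into command + args.
    if ! $PURGE_CMD -r "$days" --limit "$PURGE_LIMIT"; then
      echo "Pass -r ${days} failed, stopping here." >&2
      return 1
    fi
  done
}

# Usage, with the same thresholds as above:
# run_purge_passes 730 455 365 180 90
```

Because each pass is a separate run, you can also interrupt between two passes if the maintenance window is closing and resume from the next threshold later.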
Do I need to reclaim disk space after that?
To be totally transparent, sometimes yes but often no. After purging, performance usually improves because the table is smaller logically and queries have fewer rows to scan, but disk usage at the filesystem level may not drop because InnoDB typically marks freed pages as reusable within the table rather than returning them to the OS. If you need to reclaim the disk space e.g., you’re hitting storage limits, you can definitely rebuild the table using OPTIMIZE TABLE.
Doing that is relatively easy, but remember what we talked about earlier? The maintenance window. If your current maintenance window is no longer sufficient, schedule another one, for these reasons:
- It will lock the action_logs table during the operation. Since every action is logged, as explained above, this makes passbolt unusable for the whole duration.
- This can take 30 to 60+ minutes (sometimes even more) depending on the table size and hardware.
- You may temporarily need additional free disk space, up to roughly the size of the table, because the rebuild copies data during optimization.
Space reclaimer, you have been warned! To proceed, connect to the passbolt database and run:
OPTIMIZE TABLE action_logs;
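Before running the statement above, it can be worth checking that the disk has room for the temporary copy. The helper below is a minimal sketch of that comparison. The 10% safety margin, the function name and the /var/lib/mysql path in the usage comments are assumptions, so adapt them to your setup.

```shell
# Sketch: check that free disk space covers a temporary copy of the table.
# Both arguments are byte counts. The 10% margin is an arbitrary assumption.

has_room_for_rebuild() {
  local table_bytes="$1" free_bytes="$2"
  local needed=$(( table_bytes + table_bytes / 10 ))
  [ "$free_bytes" -ge "$needed" ]
}

# Usage (database name and data directory are placeholders, df flags are GNU):
# table_bytes=$(mysql -N -B -e "SELECT data_length + index_length
#   FROM information_schema.TABLES
#   WHERE table_schema = DATABASE() AND table_name = 'action_logs'" passbolt)
# free_bytes=$(df --output=avail -B1 /var/lib/mysql | tail -n 1 | tr -d ' ')
# has_room_for_rebuild "$table_bytes" "$free_bytes" || echo "Not enough free space"
```

If the check fails, free up space or extend the volume first rather than letting the rebuild run out of disk halfway through.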
If you don’t urgently need space back on disk, you can skip this chapter: the main operational win is deleting the bloat, and space reclaim is just storage housekeeping.
Post maintenance validation
After the purge, you want to verify the following:
- Is passbolt responsive again?
- Do login and core operations work?
- Do you see an improvement in speed?
- Have the row counts and/or table sizes dropped materially?
Depending on your environment, you can confirm the table size changes in MariaDB using information_schema, or simply compare the disk usage of the database directory before and after. Again, bear in mind that deletes may not shrink files until an optimize.
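For the information_schema route, a query along these lines gives a before and after picture. It is a sketch that only prints the SQL. The helper name is made up, and for InnoDB the row count reported here is an estimate, not an exact figure.

```shell
# Sketch: print a query reporting the approximate size of action_logs.
# table_rows is an estimate for InnoDB; data_free is space that is
# allocated to the table but currently unused.

table_size_sql() {
  cat <<'SQL'
SELECT table_name,
       table_rows AS approx_rows,
       ROUND((data_length + index_length) / 1024 / 1024) AS size_mb,
       ROUND(data_free / 1024 / 1024) AS free_inside_table_mb
FROM information_schema.TABLES
WHERE table_schema = DATABASE()
  AND table_name = 'action_logs';
SQL
}

# Usage (user and database name are placeholders for your own):
# table_size_sql | mysql -u passbolt -p passbolt
```

Run it once before the purge and once after. A large free_inside_table_mb after purging is the space an optimize would hand back to the OS.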
Conclusion
Action logs are great for forensic analysis and for detailed information such as lastLoggedIn, lastLoggedOut, etc. However, depending on your company’s policies, you may not need to keep N-year-old logs around, and left unchecked they become one of the most common causes of “Wait, why does my passbolt feel slow?” and also “Why is the server disk filling up too quickly?” The mistake is treating log cleanup like a harmless cron task when the dataset is already massive. The safe approach is operationally boring: schedule downtime, purge progressively with higher limits, and then possibly run an OPTIMIZE if you truly need to reclaim disk space. This is how you clean aggressively without gambling your production instance.
What’s next?
You have done the most painful part. Now you can continue with the chapter “Automate with cron” to run the maintenance script on a regular basis.
If you are already using the maintenance script and have any feedback on it, or simply want to confirm it’s helping you in your daily work routine, don’t hesitate to share it on our community forum. We’d love to hear from you!