
Security that scales: our approach to credential detection

Written by David Edery | May 16, 2025

Credential leaks happen all the time. Whether it's an API key, a private token, or some random password stashed in your code, the risks of accidentally exposing sensitive information in your repositories are always lurking.

How do you tackle this problem before it becomes a breach? How do you embed leak prevention into your team’s workflows? In this blog post, we'll walk through how we here at Via introduced secret detection into our Continuous Integration (CI) pipeline, the tools we tried, the regex nightmares we encountered, and the wins along the way.

Where our secret detection journey begins.

As we embarked on this journey, we quickly realized that integrating secret detection into our CI pipeline would be more than just a technical upgrade. It would require a shift in mindset and processes across our development teams.

The pre-commit option.

One option we considered was using pre-commit hooks. Pre-commit requires developers to run checks locally before committing and pushing changes from their workstations to our central GitLab instance. But here’s the twist: pre-commit can be bypassed locally by the developer. Although developers may sometimes have good reasons to bypass it—such as skipping unrelated, noisy hooks configured in their repos—this possibility leaves a crack in the armor that makes pre-commit an insufficient solution on its own.
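For reference, the pre-commit route is only a few lines of configuration. The sketch below is a generic example rather than our exact setup—the hook selection and version pin are illustrative:

```yaml
# .pre-commit-config.yaml — a generic sketch of the pre-commit option
# (hook selection and version pin are illustrative, not our production setup)
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: detect-private-key
      - id: detect-aws-credentials
```

The catch in practice: a simple git commit --no-verify skips these hooks entirely, which is exactly the gap described above.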

Why CI pipelines are ideal for secret detection.

At Via, we already run Continuous Integration pipelines on every commit, which makes them the ideal place to automate checks and catch sensitive secrets before they sneak into production. Since CI jobs run automatically, secrets are scanned in a centralized, consistent way, minimizing the risk of anything going unnoticed. We use GitLab CI, but the methodology discussed here applies to most industry CI tools.

By implementing secret detection into our CI jobs, we aim to minimize the risk of human error and keep our codebase clean and our secrets, well, secret 🙂.

However, CI pipelines still face a challenge: once a secret is committed, it lives on in the Git history—even if it's never merged from a developer’s scratchpad branch into main/master. The secret doesn’t just vanish because it wasn’t merged; it remains in the commit history. If caught early, though, it can still be erased before it ever reaches main/master. Our detection mechanism catches these secrets in time, giving developers the chance to act and remove the secret from Git history before it’s too late.

How it works. 

When a secret sneaks into a commit, our detection mechanism flags it and notifies the developer (more on this later). This gives them the chance to remove the secret right there in the branch, before merging it into the main branch. Now, if the developer squashes commits in the merge request, they bundle up all those individual changes (including the one with the leaked secret) into a single, clean commit. Squashing is like hitting “undo” on the leak, restoring order to the Git history.

But what if squashing is missed during the merge? That’s where we come prepared with a backup plan. We’ve built a Jenkins job that lets developers enter the leaked secret, and it goes through the Git history to scrub out any trace of it. Think of it as our last line of defense, protecting our history even if a squash didn’t happen.
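We won’t reproduce the Jenkins job here, but conceptually it boils down to a history rewrite. Below is a rough sketch using git filter-repo as the rewriting tool—the file name, secret value, remote URL, and branch are illustrative, and our actual job may differ in the details:

```bash
# expressions.txt holds the literal leaked value (one per line); by default
# git-filter-repo replaces every match with ***REMOVED*** across the entire history
echo 'the-leaked-secret-value' > expressions.txt
git filter-repo --replace-text expressions.txt

# filter-repo drops the origin remote as a safety measure, so re-add it and
# force-push the rewritten history (remote URL and branch name are illustrative)
git remote add origin git@gitlab.example.com:group/project.git
git push --force origin my-feature-branch
```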

As long as we detect the secret, remove it, and squash before merging, that sneaky secret is gone from history, and we can breathe easy. 

This involves using tools like Gitleaks and Trufflehog for automated scans with every commit, which we’ll explore further later in this post.

Tooling dilemma.

The first step was figuring out which secret detection tool to use. We started by benchmarking several open-source solutions, with Gitleaks and Trufflehog being the frontrunners.

When to use Gitleaks:

Let’s say you’re looking for a lightweight, quick solution just to catch an API key someone accidentally committed. That’s where Gitleaks shines. It’s the kind of tool that fits into your CI/CD pipelines or pre-commit hooks—quick, reliable, and won’t slow you down. If you’re into creating custom regex patterns (because who doesn’t want to create the perfect pattern for finding secrets?), then this is your jam.
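To give a feel for how lightweight it is, a typical invocation looks like this (flags are per Gitleaks v8; the report path is illustrative):

```bash
# scan the working tree and full git history of the current repo,
# redacting the secret values themselves from the output
gitleaks detect --source . --redact --verbose \
  --report-format json --report-path gitleaks-report.json
```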

When to use Trufflehog:

Now, if you’re more of a “leave no stone unturned” type, Trufflehog is the pick for you. This tool digs deep. It doesn’t just look for obvious patterns. Trufflehog uses entropy-based search, which enables it to find secrets no one knew were secrets.

The catch? It’s going to throw some false positives your way. But hey, it’s thorough, and sometimes you’ve got to tolerate a bit of noise to catch the sneaky stuff.
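For comparison, a deep scan of a local repository’s full history looks roughly like this (syntax per TruffleHog v3):

```bash
# walk every commit in the local repository's history looking for secrets
trufflehog git file://.
```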

Our choice: 

Gitleaks won: its customizable detection rules and its speed are both essential factors given the many jobs already running in our CI pipeline.

While Trufflehog offered more in-depth detection capabilities, the setup effort required was a significant drawback. Gitleaks provided a more straightforward implementation that allowed us to integrate secret detection efficiently into our workflow without disrupting our development processes. Its balance of functionality and ease of use made it the clear winner for our team.

Customizing Gitleaks for our organization.

Gaps:

Gitleaks has solid built-in regexes for spotting common secrets (AWS keys, GitHub tokens, etc.), but every organization has its quirks. We needed to go beyond the defaults to catch our own mix of secrets without getting flooded by irrelevant matches.

Solution:

First, we took a deep dive into the types of secrets spread across our teams: API keys, tokens, and internal credentials. Then came the fun part: crafting custom regex patterns. Yes, it was as tricky as it sounds. If you've ever worked with regex before, you know how much of a headache it can be: now we had two problems. We had to fine-tune our patterns to avoid both false positives (things that looked like secrets but weren’t) and false negatives (secrets that could slip through undetected).

To make this integration as seamless as possible for developers, we leveraged Gitleaks’ gitleaks:allow tag. When a false positive is triggered, the developer can mark the offending line, and Gitleaks will ignore it in future scans.
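The tag is just an inline comment on the offending line. For example (the value shown is obviously fake):

```yaml
# marking a known false positive so future Gitleaks scans skip this line
example_api_key: "not-a-real-secret-value"  # gitleaks:allow
```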

We review the gitleaks:allow list monthly to check whether any true-positive secrets slipped through, ensuring our repositories stay free of secrets.

Example: custom regex for passwords in URLs.

Here’s one of the custom rules we made, focused on spotting passwords in URLs:
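(The snippet below is a simplified, illustrative version of that rule rather than our exact production regex.)

```toml
# gitleaks.toml — an illustrative password-in-URL rule
[[rules]]
id = "password-in-url"
description = "Password embedded in a URL (scheme://user:password@host)"
regex = '''[a-zA-Z][a-zA-Z0-9+.-]*://[^/\s:@]+:([^/\s:@]{3,})@'''
keywords = ["://"]
```

The keywords field lets Gitleaks cheaply pre-filter lines before running the regex, which keeps scans fast.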

By adding rules like this, we made Gitleaks a tool we can trust and actually use, helping catch our unique risks without alert fatigue.

Applying Gitleaks to all teams.

Once we nailed down the rules, we wanted to make secret detection available to all projects without forcing teams to manually set it up. To streamline the process, we created a GitLab CI runner pre-configured with Gitleaks and all our custom rules baked in.

Why a runner, and not just a script step on a generic base image? By setting up a runner with everything pre-installed, our CI runs much faster since the image has all the relevant packages ready to go. In fact, with the pre-configured runner, we’re running scans in just ~11 seconds—up to 5x faster than the 30-60 seconds it used to take with a generic base image.

This means faster feedback, fewer delays, and a more efficient CI pipeline for everyone.
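One way to picture the runner image—this is a minimal sketch, with the base image, versions, paths, and rules file name being illustrative rather than our exact build:

```dockerfile
# Dockerfile for the secret-detection runner image (illustrative sketch)
FROM alpine:3.19
RUN apk add --no-cache git
# bake in the Gitleaks binary and our custom rules so jobs don't install anything at runtime
COPY --from=ghcr.io/gitleaks/gitleaks:v8.18.4 /usr/bin/gitleaks /usr/local/bin/gitleaks
COPY gitleaks.toml /etc/gitleaks/gitleaks.toml
```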

Gradually integrating into the CI pipeline.

Rolling out secret detection across all teams wasn't an overnight process. We introduced it gradually, starting with a few pilot teams. During this phase, we ran Gitleaks in “allow_failure” mode. This meant it flagged potential issues without blocking the CI pipeline, allowing teams to adjust to the new tool without any disruption.
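In GitLab CI terms, the pilot-phase job looked roughly like this (job name, runner tag, and config path are illustrative):

```yaml
# .gitlab-ci.yml — pilot-phase secret detection job (names and paths illustrative)
secret-detection:
  stage: test
  tags: [secret-detection]   # route the job to the pre-configured runner
  script:
    - gitleaks detect --source . --config /etc/gitleaks/gitleaks.toml --redact --verbose
  allow_failure: true        # flag findings without failing the pipeline; removed later to enforce blocking
```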

Feedback was invaluable during this phase. We learned which secret rules were useful and which were just noise. Once we were confident in the setup, after some fine-tuning, we made secret detection a mandatory step in the CI pipeline. By then, we had reduced false positives and made sure the checks were accurate enough to avoid overwhelming developers with unnecessary alerts.

Automating alerts and guiding remediation.

When a hard-coded secret is detected, it's crucial to notify the right people quickly. To streamline this, we set up an automated Slack alert system that notifies the developer who committed the secret. The alert provides key information such as the repository and a direct link to the exact line of code containing the secret.
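The notification step itself can be as simple as a webhook call from the CI job. Here’s a simplified sketch using GitLab’s predefined CI variables—LEAK_FILE, LEAK_LINE, and the webhook variable are illustrative, and our real alert targets the committing developer directly rather than a shared channel:

```bash
# post a Slack alert with a deep link to the offending line
# (LEAK_FILE and LEAK_LINE are illustrative values extracted from the Gitleaks report)
curl -sS -X POST -H 'Content-type: application/json' \
  --data "{\"text\": \"Secret detected in ${CI_PROJECT_PATH} (committed by ${GITLAB_USER_LOGIN}): ${CI_PROJECT_URL}/-/blob/${CI_COMMIT_SHA}/${LEAK_FILE}#L${LEAK_LINE}\"}" \
  "$SLACK_WEBHOOK_URL"
```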

Each Slack message also includes a link to a detailed Confluence page. This page provides clear instructions on how to address both false positives and true positives, ensuring that developers know exactly what steps to take. By providing this structured guidance, we’ve reduced the back-and-forth and helped teams resolve secrets quickly. 

Secret validation and resolution workflow.

To ensure secrets are fully addressed, a validation process runs the day after a valid secret is detected in the CI pipeline. This process checks whether the hardcoded secret has been removed. If any secrets remain in code, a Slack channel is automatically created to notify the responsible parties, including the developer and their manager. This prompt communication ensures quick resolution, keeping the codebase secure and minimizing delays.
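Conceptually, the next-day check is just a re-scan for the previously detected value. A simplified sketch of the idea—SECRET_VALUE and the branch name are illustrative, and the real process also handles the Slack channel creation:

```bash
# verify the previously detected value is gone from the default branch
git fetch origin main
if git grep -qF "$SECRET_VALUE" origin/main; then
  echo "secret still present - escalate to developer and manager"
else
  echo "secret removed - all clear"
fi
```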

Blocking new secrets.

As we mentioned earlier, we kicked things off by running Gitleaks in allow_failure mode—essentially a "log mode" in GitLab CI where the pipeline doesn’t fail just because a secret is found. During this time, we set up Slack notifications that pinged developers with all the details whenever a hard-coded secret was flagged, so they were always in the loop. After keeping a close eye on the feedback and how things were running, we felt confident that our setup was weeding out most false positives. So we decided it was time to flip the switch (remove allow_failure) and start blocking new secrets from slipping in. This was a big deal for us in tightening up our CI security, making sure no secrets could sneak through the cracks.

Conclusion: a step toward better security.

By adding secret detection to our CI pipeline, we took a big step toward enhancing security across the board. Was it perfect from day one? Not at all. But by rolling it out gradually, we were able to adjust and fine-tune without causing too much disruption. The results show promising indications that secrets are better contained.

We've seen a significant decrease of ~55% in the number of secrets found in committed code, with the remainder shrinking steadily each month. This shows how effective the process has been in raising awareness and improving security over time.

Stay tuned for the next post, where we'll dive into how we built a full secret detection and remediation process in Jira.