Secrets scanning for leaked tokens and API keys: finding leaked credentials in GitHub, GitLab, Bitbucket and Jira

Depov

An AWS key in the AKIA* format sits in the git history of a private repository. It was removed from the code eight months ago, but aws sts get-caller-identity returns a live IAM account with access to three S3 buckets. From finding the old commit to reading the data: twenty minutes. This scenario repeats from project to project when auditing cloud environments: secrets are removed from the HEAD of branches, but the commit history, forks and CI/CD artifacts are forgotten. At the same time, according to soteri.io, more than half of the credentials leaked to GitHub are generic secrets without a structured format, which standard regex rules do not catch. That is, the scanner is in place, and it stays silent.


The place of leaked secrets in the attack chain


Searching for hardcoded secrets is not an isolated technique but a full-fledged stage of the kill chain that connects reconnaissance with access to cloud infrastructure. Broken down by MITRE ATT&CK:






Reconnaissance. Scanning public repositories for leaked keys is the Code Repositories technique (T1593.003). The attacker uses the GitHub Search API, search dorks and automated tools to detect exposed secrets. This is the first step, and it's free.






Credential Access. The secrets found are classified by type: Credentials In Files (T1552.001) for API keys and passwords in configuration files, Private Keys (T1552.004) for SSH and TLS keys, Steal Application Access Token (T1528) for OAuth tokens and service account credentials.






Initial Access and Privilege Escalation. A valid cloud token is a direct path to Cloud Accounts (T1078.004), which covers the tactics of initial access, persistence and privilege escalation. One leaked GCP service account JSON with the Owner role, and the whole project is compromised.






Lateral Movement and Collection. With compromised tokens, the attacker moves through the infrastructure: Application Access Token (T1550.001) for lateral movement, Code Repositories (T1213.003) to collect data from private repositories, Data from Cloud Storage (T1530) to access S3/GCS/Azure Blob.






Context of application: this chain works both on an external pentest (scanning public repositories) and on an internal one (access to the organization's private repos through a compromised account). On an external engagement, start with T1593.003. On an internal one, when you already have access to GitLab / Bitbucket, go straight to T1213.003 and T1552.001.
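For tagging findings in a report, the mapping above can be kept as a small lookup table. This is only a sketch: the dictionary layout and the function name starting_techniques are illustrative assumptions, while the technique IDs and names are the ones listed in the chain above.

```python
# ATT&CK techniques named in the chain above, grouped by stage.
# The grouping keys are illustrative labels, not official tactic names.
KILL_CHAIN = {
    "reconnaissance": [("T1593.003", "Code Repositories")],
    "credential_access": [
        ("T1552.001", "Credentials In Files"),
        ("T1552.004", "Private Keys"),
        ("T1528", "Steal Application Access Token"),
    ],
    "initial_access_privesc": [("T1078.004", "Cloud Accounts")],
    "lateral_movement_collection": [
        ("T1550.001", "Application Access Token"),
        ("T1213.003", "Code Repositories"),
        ("T1530", "Data from Cloud Storage"),
    ],
}


def starting_techniques(engagement: str) -> list[str]:
    """Return the technique IDs to start from for an engagement type."""
    if engagement == "external":
        return ["T1593.003"]
    if engagement == "internal":
        return ["T1213.003", "T1552.001"]
    raise ValueError(f"unknown engagement type: {engagement!r}")
```

Calling starting_techniques("internal") returns the two techniques the internal scenario jumps to directly.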


Where to look for leaked tokens and API keys in cloud environments


Git repositories: the commit history keeps everything


The most obvious and richest source. GitHub secret scanning scans the entire Git history on all branches of the repository (according to the documentation). The attacker does the same thing. Key search points:


• Commit history and deleted branches. The command git log -p --all -S 'AKIA' searches for a string in all commits reachable through branches and tags. But --all does not cover unreachable (dangling) commits: they need git log --all --reflog -p -S 'AKIA' or git fsck --lost-found. Such commits live until garbage collection (git gc), and not everyone runs it.


• Configuration files. .env, config.yaml, appsettings.json, Terraform and CloudFormation templates are the standard places where developers leave credentials during local testing. Even adding .env to .gitignore does not delete the file from commits that already exist; many people miss this nuance.


• Pull requests and issues. GitHub scans descriptions and comments in issues, PRs, Discussions and even wikis. Attackers do too: a query like org:target-company filename:.env or org:target-company "API_KEY" through the GitHub Search API returns results in seconds.


• Forks and gists. A public fork of a private repository inherits the entire history, including secrets. Secret gists are available to anyone with the direct link, which in practice means public.
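The history search from the first bullet can be wrapped in a short script. A minimal Python sketch, assuming git is on PATH; the function names and the AKIA regex (prefix plus 16 uppercase letters or digits) are illustrative assumptions, not part of any tool mentioned here.

```python
import re
import subprocess

# Assumed shape of an AWS access key ID: "AKIA" plus 16 uppercase letters/digits.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def history_search_command(needle: str, include_reflog: bool = False) -> list[str]:
    """Build the git log invocation described above as an argv list."""
    cmd = ["git", "log", "--all"]
    if include_reflog:
        cmd.append("--reflog")  # also walk commits that are only in the reflog
    cmd += ["-p", "-S", needle]
    return cmd


def scan_patch(patch_text: str) -> list[tuple[str, str]]:
    """Return (commit, key_id) pairs found on added or removed diff lines."""
    findings = []
    commit = None
    for line in patch_text.splitlines():
        if line.startswith("commit "):
            commit = line.split()[1]
        elif line.startswith(("+", "-")) and not line.startswith(("+++", "---")):
            for match in AWS_KEY_ID.findall(line):
                findings.append((commit, match))
    return findings


def scan_repo(repo_path: str) -> list[tuple[str, str]]:
    """Run the search in a local clone and scan the resulting patches."""
    result = subprocess.run(
        history_search_command("AKIA", include_reflog=True),
        cwd=repo_path, capture_output=True, text=True,
        errors="replace", check=True,
    )
    return scan_patch(result.stdout)
```

Removed lines are scanned as well as added ones, because a commit that deletes a key still shows the key in its diff.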


CI/CD pipelines and non-git platforms


Secrets leak not only through code. Typical attack vectors for Azure DevOps and GitHub Enterprise include pipeline secret leakage, YAML injection in self-hosted agents and theft of OAuth application tokens. In GCP: leaked signed URLs and IAM misconfiguration.


• Bitbucket Pipelines and GitLab CI. Environment variables passed on the command line end up in build logs. Build artifacts can contain .env files that are saved as downloadable archives. I have seen this on projects more than once: the developer thinks the pipeline is "internal", while the artifacts can be downloaded by anyone with access to the project.


• Jira and Confluence. Developers attach config files, screenshots with tokens, error dumps with Authorization headers. A Jira search like text ~ "AKIA" OR text ~ "Bearer" OR text ~ "sk_live" is the first thing to do in an internal audit.


• Logs and monitoring. Developers add debug output (console.log, logging of HTTP headers) and forget to remove it. Secrets end up in Kibana, Grafana and cloud log storage, often with access for all employees. A classic: Authorization: Bearer sk_live_... right in Kibana, available to the entire department.
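Two of the searches above are easy to script: building the Jira query from a pattern list, and pulling bearer tokens out of exported log text. A minimal Python sketch; the function names are hypothetical, and the query is only built here, not sent to any Jira API.

```python
import re

# Matches "Authorization: Bearer <token>" as it tends to appear in debug logs.
AUTH_HEADER = re.compile(r"Authorization:\s*Bearer\s+(\S+)", re.IGNORECASE)


def build_jql(patterns: list[str]) -> str:
    """Join patterns into a text-search query of the form shown above."""
    clauses = []
    for pattern in patterns:
        # Escape backslashes and double quotes so the quoted term stays intact.
        escaped = pattern.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append(f'text ~ "{escaped}"')
    return " OR ".join(clauses)


def find_bearer_tokens(log_text: str) -> list[str]:
    """Return every bearer token value found in the given log text."""
    return AUTH_HEADER.findall(log_text)
```

For example, build_jql(["AKIA", "Bearer", "sk_live"]) reproduces the query from the Jira bullet.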


Secrets scanning in repositories: trufflehog, gitleaks, detect-secrets


Comparing the tools and their differences


Three tools dominate the open-source segment. Their architectures are fundamentally different, and the choice depends on the scenario.

Gitleaks (current releases 8.x, see github.com/gitleaks/gitleaks/releases; written in Go, actively maintained) is my choice for a quick scan of repository history. Installed via brew install gitleaks on macOS or by downloading the binary for Linux. Key command: gitleaks detect --source . --log-opts="--all" --verbose, which scans all commits on all branches. Fast, lightweight, configurable.






detect-secrets (current releases 1.x, see github.com/Yelp/detect-secrets/releases; from Yelp, Python) takes another approach: you create a baseline file with detect-secrets scan > .secrets.baseline, then with detect-secrets audit .secrets.baseline you interactively mark each finding as a true or false positive. After that it alerts only on new findings. Suitable for long-term monitoring, but it has no built-in option for iterating over git history (unlike gitleaks --log-opts); it is focused on scanning a filesystem snapshot and working through pre-commit.






trufflehog is the only one of the three that can verify found secrets by calling the provider's API (AWS, GitHub, Slack, etc.). On a pentest this saves hours of manual verification, but it creates noise on the provider's side. If the customer asks for a covert audit, verification traffic will give your activity away. Keep this in mind.
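After a gitleaks run, the JSON report is usually the first thing to triage. A minimal Python sketch, assuming the report was written with --report-format json and that each finding carries the RuleID, Secret and Commit fields as in gitleaks 8.x reports; check the field names against your own report before relying on them.

```python
import json
from collections import Counter


def findings_per_rule(report_json: str) -> Counter:
    """Count findings per rule to see which detectors fired most."""
    findings = json.loads(report_json)
    return Counter(f.get("RuleID", "unknown") for f in findings)


def commits_per_secret(report_json: str) -> dict[str, list[str]]:
    """Group commits by secret value: one secret often appears in many commits."""
    by_secret: dict[str, set[str]] = {}
    for finding in json.loads(report_json):
        secret = finding.get("Secret", "")
        by_secret.setdefault(secret, set()).add(finding.get("Commit", ""))
    return {secret: sorted(commits) for secret, commits in by_secret.items()}
```

Grouping by secret keeps the validation list short: each distinct value needs to be checked only once, however many commits it appears in.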


Limitations of the tools: what all three will miss


None of the tools is a silver bullet. Specific cases where a standard scan will fail:


• Base64-encoded secrets. The string QUtJQTEyMzQ1Njc4OTBBQkNERUY= is AKIA1234567890ABCDEF in base64. Most regex patterns do not decode base64 before checking. The developer put the key in .env.example in encoded form, and the scanner stays silent.


• Custom token formats. According to soteri.io, more than half of the leaked credentials on GitHub are generic secrets without a characteristic prefix (not AKIA*, not sk_live_). Entropy filters help but produce false positives on hashes, UUIDs and compressed data. Typical thresholds: entropy > 4.5 for base64 strings, > 3.0 for hex, > 3.7 for alphanumeric. Tune them for the project by hand.


• Secrets in binary files. Compiled artifacts, Docker images and archives are outside the scope of text scanning. Separate tools for container analysis are needed (trivy, grype).


• Non-standard platforms. Jira attachments, Confluence pages, Slack channels: neither gitleaks nor detect-secrets covers them. This calls for manual search or integration through the APIs. Automation is not the norm here yet.
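The first two gaps can be narrowed with a small pre-pass: decode base64 candidates before applying the regex, and compute Shannon entropy for strings without a known prefix. A minimal Python sketch; the function names, the candidate regex and the AKIA pattern are illustrative assumptions, and the thresholds are the typical values quoted above, not tuned ones.

```python
import base64
import binascii
import math
import re
from collections import Counter

AWS_KEY_ID = re.compile(r"AKIA[0-9A-Z]{16}")
# Runs of base64 alphabet characters long enough to hide a key ID.
B64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")
# Typical thresholds quoted above; tune per project.
THRESHOLDS = {"base64": 4.5, "hex": 3.0, "alnum": 3.7}


def shannon_entropy(s: str) -> float:
    """Shannon entropy of the string in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def looks_random(s: str, charset: str) -> bool:
    """True if the string's entropy exceeds the threshold for its charset."""
    return shannon_entropy(s) > THRESHOLDS[charset]


def find_encoded_aws_keys(text: str) -> list[str]:
    """Decode base64-looking tokens and search the result for AWS key IDs."""
    hits = []
    for token in B64_CANDIDATE.findall(text):
        try:
            raw = base64.b64decode(token, validate=True)
        except (binascii.Error, ValueError):
            continue  # not valid base64 (wrong length or padding); skip
        hits.extend(AWS_KEY_ID.findall(raw.decode("ascii", errors="ignore")))
    return hits
```

Running find_encoded_aws_keys on a line such as KEY=QUtJQTEyMzQ1Njc4OTBBQkNERUY= recovers the key ID that a plain AKIA regex would not see.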


Searching for API keys in repositories: a step-by-step pentester workflow


Environment requirements


• OS: Linux (Kali, Ubuntu 22.04+) or macOS. Windows via WSL2


• RAM: 4 GB minimum, 8 GB recommended for scanning large repositories (100k+ commits)


• Tools: git 2.30+, gitleaks 8.x (Go binary), Python 3.9+ for detect-secrets, jq for parsing JSON reports


• Network: Internet access for cloning repositories and verifying keys. For a covert audit, run verification through a VPN or proxy


• Access: on an external pentest, public repositories and the GitHub Search API (the Search API requires authentication; the limit is 30 req/min for authenticated requests; general REST API: 60 req/hour without a token, 5000 req/hour with a token). On an internal one, access to the organization's GitLab/Bitbucket/Jira


Validation of found credentials


A found secret is useless without validation: it can be a test key, a long-revoked token or a placeholder from the documentation. Validation depends on the type:


• AWS Access Key (AKIA*). Run AWS_ACCESS_KEY_ID=AKIA... AWS_SECRET_ACCESS_KEY=... aws sts get-caller-identity; if the response contains an ARN and Account ID, the key is live. Then aws iam list-attached-user-policies shows the account's permissions. This leaves a record in CloudTrail; take that into account on a covert audit.


• GitHub Personal Access Token (ghp_*). Request curl -H "Authorization: token ghp_..." https://api.github.com/user; if it returns 200, the token is valid. Check the scopes in the X-OAuth-Scopes response header. Sometimes you find tokens with repo and admin:org, which is a jackpot.


• Slack Webhook URL. Send a POST with a test payload to a URL of the form https://hooks.slack.com/services/T.../B.../...; if it returns 200, the webhook is active and you can send messages to the channel.


• GCP Service Account JSON. Authenticate with gcloud auth activate-service-account --key-file=key.json, then gcloud projects list will show the available projects.
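For the GitHub token check, the interesting part after the request is interpreting the response. A minimal Python sketch that takes the status code and the X-OAuth-Scopes header value from whatever HTTP client you use; the function names and the choice of repo and admin:org as the high-value set are assumptions taken from the bullet above.

```python
from typing import Optional

# Scopes called out above as the jackpot; extend for your engagement.
HIGH_VALUE_SCOPES = {"repo", "admin:org"}


def parse_oauth_scopes(header_value: Optional[str]) -> list[str]:
    """Split a comma-separated X-OAuth-Scopes header into scope names."""
    if not header_value:
        return []
    return [scope.strip() for scope in header_value.split(",") if scope.strip()]


def assess_github_token(status_code: int, scopes_header: Optional[str]) -> dict:
    """Summarize a /user response: validity, scopes, and high-value scopes."""
    scopes = parse_oauth_scopes(scopes_header)
    return {
        "valid": status_code == 200,
        "scopes": scopes,
        "high_value": sorted(HIGH_VALUE_SCOPES.intersection(scopes)),
    }
```

The result dictionary maps directly onto the fields that go into the report: validity and access scope.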


Document every validation: type of secret, source (file, commit, platform), verification result, access scope. This goes into the report and determines the severity of the finding.


What shows up in the logs: OPSEC when scanning secrets


On a cloud pentest, it is important to understand what traces your audit leaves. Without this understanding, you can trigger the SOC on the first validation.


• GitHub Advanced Security notifies administrators when secrets are found in a push. If you're testing push protection, administrators will receive an email notification. Push protection works for a documented list of token types (according to GitHub documentation), and administrators see security bypass events.


• GitLab Secret Detection works in two modes: push protection (blocks the secret from getting into the repository) and pipeline scanning. Both modes log their results. GitLab Duo on the Ultimate tier filters out false positives through AI analysis, assigning a confidence score to each finding.


• CloudTrail / Cloud Audit Logs. Using an AWS key creates a record in CloudTrail (userIdentity contains accessKeyId), which allows the SOC to detect an anomaly by source IP. In GCP, similarly, each call from the service account is recorded in Cloud Audit Logs. If the SOC is tuned to alert on the first use of credentials from a new IP, your verification becomes an incident. On one project this is exactly what happened: one key was checked, and 15 minutes later a call came from the customer's security team.


• GitHub API rate limiting. The GitHub Search API requires authentication; the rate limit for authenticated requests is 30 requests per minute. A sharp spike in requests from one IP can attract attention.


Recommendations: on an external pentest, run verification through a VPN and, for each provider, use a minimally invasive read-only endpoint. On an internal one, agree with the customer in advance which checks are permissible and which SOC alerts to expect.
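To see why a single validation call can raise an alert, it helps to look at the detection from the SOC side. A minimal Python sketch of a "first use from a new IP" rule over CloudTrail-style records; the function name and the known_ips input are hypothetical, and the record fields used (eventTime, sourceIPAddress, userIdentity.accessKeyId) follow the CloudTrail record layout.

```python
def new_ip_events(records: list[dict], known_ips: dict[str, set[str]]) -> list[dict]:
    """Return events where an access key is used from an IP not seen before.

    known_ips maps accessKeyId -> set of source IPs considered normal.
    """
    seen = {key: set(ips) for key, ips in known_ips.items()}
    alerts = []
    # ISO 8601 timestamps sort correctly as plain strings.
    for record in sorted(records, key=lambda r: r.get("eventTime", "")):
        key_id = record.get("userIdentity", {}).get("accessKeyId")
        ip = record.get("sourceIPAddress")
        if not key_id or not ip:
            continue
        if ip not in seen.setdefault(key_id, set()):
            alerts.append(record)
            seen[key_id].add(ip)  # alert once per new (key, IP) pair
    return alerts
```

A lone GetCallerIdentity call from the tester's address is exactly the kind of event this rule surfaces.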


Checklist for secrets scanning in cloud environments


A numbered list of actions; you can hand it to the customer as part of the report:


1. Clone all available repositories of the organization (including archived ones) with complete history (git clone)


2. Run gitleaks with --log-opts="--all" on each repository, using the standard rules plus a custom .gitleaks.toml for the customer's specifics


3. Check the GitHub/GitLab/Bitbucket Search API for secrets in issues, PRs, wikis


4. Check Jira and Confluence: text search by characteristic patterns (AKIA, sk_live_, -----BEGIN, Bearer, ghp_, glpat-)


5. Check CI/CD pipelines: build logs, artifacts, environment variables without masking


6. Validate every secret found through a read-only provider endpoint


7. Determine the scope of access for each live secret (permissions, resources, data)


8. Check whether repositories are scanned by the platform's built-in scanner (GitHub Secret Scanning, GitLab Secret Detection)


9. Check for pre-commit hooks in .pre-commit-config.yaml: without them, secrets are free to go down in history


10. Document: secret, source, commit date, author, status (live/revoked), scope, for each finding
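The record from step 10 can be kept in a fixed structure so that every finding has the same fields. A minimal Python sketch; the class, field and function names are illustrative, and storing a redacted preview instead of the full secret is my assumption about what should go into a shared report.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Finding:
    secret_type: str      # e.g. "AWS Access Key"
    source: str           # file, commit or platform where it was found
    commit_date: str
    author: str
    status: str           # "live" or "revoked"
    scope: str            # what the secret gives access to
    secret_preview: str   # redacted value, never the full secret


def redact(secret: str, keep: int = 4) -> str:
    """Keep the first few characters and mask the rest."""
    if len(secret) <= keep:
        return "*" * len(secret)
    return secret[:keep] + "*" * (len(secret) - keep)


def to_report(findings: list[Finding]) -> str:
    """Serialize findings to JSON for the report appendix."""
    return json.dumps([asdict(f) for f in findings], indent=2)
```

The prefix left by redact is enough to tell the secret type apart in the report without handing a live credential to every reader.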


Public incidents at large cloud providers show that even the largest companies are not immune. The scale of the consequences is determined not by the presence of a scanner, but by the speed of secret rotation.






The main problem I see from project to project is the gap between detection and response. Organizations introduce gitleaks into CI/CD, get a report with findings in the git history and ... do nothing. The secret is formally "removed from the code" but not revoked at the provider, the history is not cleaned with git filter-repo or BFG Repo-Cleaner, and forks with a full copy of the history continue to live their own lives.






The tools for scanning secrets in repositories are quite mature today: regex + entropy + verification cover the vast majority of structured formats. The weak link is not the detection technology but the rotation process. In the next couple of years, the volume of leaked credentials will grow in proportion to the growth of microservice architectures and the number of service accounts. Many cloud incidents will begin with a secret that the scanner found and nobody revoked. The only working approach is automatic rotation (AWS Secrets Manager, HashiCorp Vault) in conjunction with pre-commit hooks and CI/CD scanning. Without rotation, everything else is reports for the sake of reports.
 