A less sophisticated but nevertheless effective attack vector
We are becoming accustomed to attackers gaining the upper hand through basic security oversights. We already know that phishing, as unsophisticated as it looks, has become one of the most effective tactics.
On the web, there are several kinds of basic security oversights. The most prevalent is exposing private files, mainly ones that contain credentials or API keys. Recently, the Internet Archive was breached due to an exposed GitLab configuration file.
Mass Heist: 15,000 Cloud Credentials Stolen via Exposed Git Configs
The Sysdig Threat Research Team (TRT) uncovered this whale of an operation, which they suggestively named EMERALDWHALE.
The attackers used various tools to scan for misconfigured web services and extract data from them, which allowed them to steal credentials, access private repositories, and pull other sensitive information, such as cloud credentials, out of the source code. In this campaign the attackers managed to steal credentials from over 10,000 private repositories, and the stolen data was kept in an S3 bucket belonging to a victim.
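To make the risk concrete, here is a minimal sketch of how one might audit a Git config file for credentials embedded in remote URLs. It assumes the common case where a token is stored directly in an HTTP(S) remote URL; the function name, the sample config, and the host `git.example.com` are hypothetical and do not come from the Sysdig report.

```python
import re
from urllib.parse import urlsplit

# Matches "url = <value>" lines as they appear in a Git config file.
URL_LINE = re.compile(r"^\s*url\s*=\s*(\S+)", re.MULTILINE)


def remotes_with_embedded_credentials(config_text):
    """Return HTTP(S) remotes whose URL embeds a username or token.

    The secret part is stripped from the returned values, so the result
    is safe to print or log.
    """
    flagged = []
    for raw_url in URL_LINE.findall(config_text):
        parts = urlsplit(raw_url)
        if parts.scheme in ("http", "https") and (parts.username or parts.password):
            flagged.append(f"{parts.scheme}://{parts.hostname}{parts.path}")
    return flagged


# Hypothetical sample: one remote with an embedded token, one SSH remote.
SAMPLE = """[core]
\trepositoryformatversion = 0
[remote "origin"]
\turl = https://deploy-bot:EXAMPLE_TOKEN@git.example.com/acme/app.git
\tfetch = +refs/heads/*:refs/remotes/origin/*
[remote "mirror"]
\turl = git@git.example.com:acme/app.git
"""
```

Calling `remotes_with_embedded_credentials(SAMPLE)` flags only the `origin` remote. SSH-style remotes are ignored because they carry no secret in the URL itself.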
TRT also found that the attackers used various legitimate vulnerability search and scanning tools (Shodan, Masscan). Overall, the attack looked like a minimal-effort operation built on free tools, yet it delivered results.
It also seems that web scraping was the most effective technique for finding misconfigured services and exposed credentials.
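From the defender's side, the same kind of check can be pointed at your own sites. The sketch below assumes the exposure takes the form of a publicly readable `/.git/config` path and uses the presence of a `[core]` section as a rough heuristic; the function names are hypothetical, and it should only be run against systems you own or are authorized to test.

```python
import urllib.request


def fetch_text(url, timeout=5.0):
    """Return the start of the response body, or None on any HTTP/network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read(4096).decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # urllib's URLError/HTTPError and timeouts are OSError subclasses;
        # ValueError covers malformed URLs.
        return None


def looks_like_git_config(body):
    """Heuristic: a real Git config normally contains a [core] section."""
    return body is not None and "[core]" in body


def git_config_exposed(base_url, fetch=fetch_text):
    """Check whether <base_url>/.git/config is served to anonymous visitors.

    `fetch` is injectable so the logic can be exercised without network access.
    """
    url = base_url.rstrip("/") + "/.git/config"
    return looks_like_git_config(fetch(url))
```

For example, `git_config_exposed("https://www.example.com")` would return `True` only if the server hands back something that looks like a Git config. The heuristic is deliberately simple, so custom error pages that return HTTP 200 are not mistaken for a hit.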
Target and victim analysis
TRT also managed to extract logging data from the S3 bucket left by EMERALDWHALE. The data includes targeting lists, tool output, and raw data collected.
IP Addresses: 500M+
IP Ranges: 12k
Domains: 500k
EC2 hostnames: ~1M
This is very interesting threat research; read it in more detail here.