Threat Intelligence “Provenance” Could be the Solution to a Fractured Global Cybersecurity Landscape

Geopolitical tensions threaten the sharing of cybersecurity research between countries, but research from Georgia Tech demonstrates that an auditable provenance system could validate how threat intelligence was produced rather than relying on trust in who produced it.

In January, China reportedly announced a sweeping ban on US technology and security vendors such as VMware, Palo Alto Networks, and Fortinet, and even Google.

The US previously banned the Russian cybersecurity firm Kaspersky over national security concerns.

These bans threaten the open sharing of threat intelligence information between defenders. When a new piece of malware can reach a global scale within minutes, it’s important not to let national borders hinder the sharing of vital data. When the enemy is global, you need the defense to be global as well.

But quality research isn't suddenly worthless just because it comes from a geopolitical adversary, so a way of verifying the quality of the research independent of its country of origin would be valuable.

As threat intelligence data moves through this complex ecosystem of vendors, researchers, and feeds, it's often unclear how useful or accurate the data even is.

Scientists in a DARPA-funded study developed a method of tracing the propagation of threat intelligence data through the ecosystem: they embedded unique watermarks into benign files and tracked those files as they moved between actors.
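The core idea is simple enough to sketch. The snippet below is an illustration of the concept, not the study's actual tooling: generate a unique, harmless file, record its hash, and later search shared intelligence feeds for that hash to see which vendors the file reached.

```python
import hashlib
import os
import uuid

def make_watermarked_sample(directory: str) -> str:
    """Create a benign file carrying a unique watermark and return its hash."""
    token = uuid.uuid4().hex  # the unique watermark
    path = os.path.join(directory, f"sample_{token}.bin")
    with open(path, "wb") as f:
        f.write(b"BENIGN-WATERMARK:" + token.encode())
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    # Querying intelligence feeds for this hash later reveals which
    # vendors received and shared the file.
    return digest
```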

The study found that while 67% of vendors perform dynamic malware analysis, only 17% shared the intelligence they extracted. Malicious URLs were shared 20 times more frequently than the actual malware itself, meaning defenders were accepting other vendors' conclusions without the evidence needed to back them up.

Data sharing could also be delayed by hours or days, putting defenders at risk. The researchers even found malware that used publicly available lists of the IP addresses of researchers' sandbox environments to evade analysis, reducing the number of vendors receiving intelligence by 25%.

Questions about the veracity of threat intelligence from certain countries also threaten to reduce the amount of intelligence available globally. There needs to be a way to verify the history of threat intelligence data, one that documents its entire lifecycle.

Enter: secure data provenance.

Data provenance summarizes the history of an item's ownership, as well as the actions performed on it.

The idea with provenance for TI data is that you should be able to tell where it was first observed, how it was analyzed (e.g., via static analysis, execution in a sandbox environment, or manual review), how deeply it was examined, which independent parties validated the research, and how long each step took.
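A minimal sketch of what such a record might look like follows. The field names here are illustrative assumptions, not part of any published standard:

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class AnalysisStep:
    vendor: str            # who performed this step
    method: str            # "static", "sandbox", or "manual"
    started: datetime
    finished: datetime
    executed_sample: bool  # did the vendor actually run the sample?
    signature: str         # vendor's cryptographic signature over this step

@dataclass
class ProvenanceRecord:
    sample_hash: str       # SHA-256 of the artifact being described
    first_observed: datetime
    first_observer: str
    steps: list[AnalysisStep] = field(default_factory=list)
    validated_by: list[str] = field(default_factory=list)  # independent confirmations
```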

The new research shows that it's technically feasible to create a formal, standardized, and auditable provenance system that tracks the path of TI data through various vendors and confirms the quality of the data. Vendors could then filter for data that meets certain criteria, such as intelligence that has been re-analyzed by domestic vendors to confirm the research.
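Continuing the hypothetical `ProvenanceRecord` sketch above, a vendor-side policy filter might look like this; `DOMESTIC_VENDORS` is a placeholder policy list, not a real registry:

```python
# Placeholder policy: vendors whose re-analysis we treat as confirmation.
DOMESTIC_VENDORS = {"vendor-a", "vendor-b"}

def meets_policy(record: ProvenanceRecord) -> bool:
    """Keep intelligence only if a trusted vendor independently ran the sample."""
    return any(
        step.vendor in DOMESTIC_VENDORS and step.executed_sample
        for step in record.steps
    )

def filter_feed(feed: list[ProvenanceRecord]) -> list[ProvenanceRecord]:
    """Reduce an incoming batch of records to those meeting local policy."""
    return [r for r in feed if meets_policy(r)]
```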

You could also sanitize the provenance metadata to protect vendors' operational details.
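One plausible approach, again building on the sketch above, is to pseudonymize vendor identities with salted hashes so an auditor can still see that distinct parties performed the steps without learning who they are:

```python
import hashlib

def sanitize(record: ProvenanceRecord, salt: bytes) -> ProvenanceRecord:
    """Replace vendor identities with salted pseudonyms; the salt would be
    held by a trusted auditor who can reverse the mapping if needed."""
    def pseudonym(name: str) -> str:
        return hashlib.sha256(salt + name.encode()).hexdigest()[:12]

    for step in record.steps:
        step.vendor = pseudonym(step.vendor)
    record.validated_by = [pseudonym(v) for v in record.validated_by]
    return record
```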

Provenance data could also expose the bottlenecks behind the hours- or days-long sharing delays the study observed.
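With timestamped steps in the record, spotting those bottlenecks becomes a simple query. The six-hour threshold below is an arbitrary example value:

```python
from datetime import timedelta

def sharing_delays(
    record: ProvenanceRecord,
    threshold: timedelta = timedelta(hours=6),
) -> list[tuple[str, timedelta]]:
    """Flag analysis steps that began long after the sample was first observed."""
    return [
        (step.vendor, step.started - record.first_observed)
        for step in record.steps
        if step.started - record.first_observed > threshold
    ]
```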

Provenance data could also reveal when antivirus vendors fail to actually execute malware during their research; the gap would show up in the audit trail and incentivize vendors to perform deeper analysis.
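Using the same hypothetical record, an auditor could list the steps where sandbox analysis was claimed but the sample was never run:

```python
def unexecuted_dynamic_steps(record: ProvenanceRecord) -> list[AnalysisStep]:
    """Steps where a vendor claimed sandbox analysis but never ran the sample."""
    return [
        step for step in record.steps
        if step.method == "sandbox" and not step.executed_sample
    ]
```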

When vendors simply re-share indicators of compromise or detection labels without performing their own independent analysis, it creates an illusion of independent consensus. Provenance data would reveal these patterns and incentivize vendors to perform their own deep analysis.
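Detecting that illusion is, in this sketch, a set difference: vendors who endorsed a verdict but contributed no analysis step of their own are merely echoing others' work.

```python
def echo_only_vendors(record: ProvenanceRecord) -> set[str]:
    """Vendors that validated a verdict without an analysis step of their own."""
    analyzed = {step.vendor for step in record.steps}
    return set(record.validated_by) - analyzed
```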

The pieces are all in place; we just need a push from cybersecurity companies to make it happen.
