Continuation from Part 1

Part 1 of this post explained how most common programming languages provide a way to list the 3rd party libraries in use, either through built-in tools (NuGet for C#) or through plugins and packages (such as the Gradle License Report or pip-licenses).

This helps in generating a Cybersecurity Bill Of Materials (CBOM) or C-BOM which serves as good housekeeping for many products. In the case of medical devices, it is essential when following necessary guidance documents for product submission.

The second part of this post expands on the use of vulnerability databases and describes the basics of automating as much of the CVE tracking for 3rd party libraries as possible. This is by no means a simple task, and there are commercially available tools on the market; Veracode and Snyk are two suggestions if you’re interested. My purpose is not to evaluate these tools here, but I’m happy to give feedback if there’s interest.

From CBOM to Vulnerability search

The main challenge with automatic searching is turning the information in the CBOM report into a consistent way of retrieving CVEs for each library. This is not a trivial task. The author of reference [1] explains the problems with using the Common Platform Enumeration (CPE) for this purpose.

What are CPEs?

Reference [2] gives a good definition of the CPE as a naming scheme for applications, hardware, devices and operating systems. There are two versions, 2.2 and 2.3, with different specifications, and both use dictionaries to structure the vendor, library name and version information in a hierarchical format.
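To make the hierarchical format concrete, here is a minimal sketch that splits a CPE 2.3 formatted string into its named fields. The example string is only an illustration of the layout, and the parser assumes that no field contains an escaped colon, which a complete implementation would have to handle.

```python
# Field order of a CPE 2.3 formatted string, after the "cpe:2.3:" prefix.
CPE23_FIELDS = (
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
)


def parse_cpe23(cpe):
    """Split a CPE 2.3 formatted string into a dict of named fields.

    Simplification: assumes no field contains an escaped colon.
    """
    pieces = cpe.split(":")
    if len(pieces) != 13 or pieces[0] != "cpe" or pieces[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    return dict(zip(CPE23_FIELDS, pieces[2:]))


# Illustrative string only: part "a" means application, "*" means "any".
example = "cpe:2.3:a:fasterxml:jackson-databind:2.9.0:*:*:*:*:*:*:*"
fields = parse_cpe23(example)
print(fields["vendor"], fields["product"], fields["version"])
```

The vendor, product and version fields are the ones a CBOM entry has to be mapped onto, which is where most of the difficulty described next comes from.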

The problem that both [1] and [2] point out is that there are several limitations with CPEs even with the work that has gone into specifying the CPE standards. These can be summarised as follows:

  • Many open source tools / libraries are not contained in the CPE lists.
  • Various CVE feeds contain entries that do not have a CPE identifier.
  • As a result, it is challenging for a tool to fully automate all the necessary steps needed to search and find the CVEs related to software libraries in use.

Reference [2] goes so far as to describe a multi-step algorithm that searches for the CVEs affecting a product by using the CPEs attached to each CVE and by finding relevant CVEs from their summary information.
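The general idea of combining both sources can be sketched in a few lines. This is not the algorithm from [2], only a simplified illustration: the records, the CVE ids and the function name are all made up, and the summary step is a plain substring search.

```python
# Hypothetical, hand-made records; real feeds carry many more fields.
CVE_RECORDS = [
    {"id": "CVE-0000-0001",
     "cpes": ["cpe:2.3:a:examplevendor:examplelib:1.2.0:*:*:*:*:*:*:*"],
     "summary": "Buffer overflow in examplelib 1.2.0."},
    {"id": "CVE-0000-0002",
     "cpes": [],  # entry published without a CPE identifier
     "summary": "ExampleLib before 1.3 allows remote code execution."},
    {"id": "CVE-0000-0003",
     "cpes": ["cpe:2.3:a:othervendor:otherlib:4.0:*:*:*:*:*:*:*"],
     "summary": "Denial of service in otherlib."},
]


def find_candidate_cves(product, version, records):
    """Return (cpe_matches, summary_matches) as lists of CVE ids."""
    product = product.lower()
    cpe_matches, summary_matches = [], []
    for record in records:
        # Step 1: exact product and version match on any attached CPE.
        hit = False
        for cpe in record["cpes"]:
            parts = cpe.split(":")
            if len(parts) >= 6 and parts[4] == product and parts[5] == version:
                hit = True
                break
        if hit:
            cpe_matches.append(record["id"])
        # Step 2: fall back to a keyword search of the summary text.
        elif product in record["summary"].lower():
            summary_matches.append(record["id"])
    return cpe_matches, summary_matches


cpe_hits, summary_hits = find_candidate_cves("examplelib", "1.2.0", CVE_RECORDS)
print("CPE matches:", cpe_hits)
print("Summary matches (need manual review):", summary_hits)
```

Summary matches are deliberately kept in a separate list, since a keyword hit says nothing about whether the version in use is actually affected.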

Basic Support for CVE searching

Armed with knowledge about the limitations of CPEs and the ability to search for CVE information, we will outline the simplest possible do-it-yourself approach, which could be extended over time.

Vulnerability Database

The post Tracking vulnerabilities in 3rd party tools outlined a number of public databases for searching vulnerabilities. These included the NVD, CVE Details and Circ.lu. A better one for automating retrieval, and one that provides considerable detail, is Vuldb.

The recommendation is to sign up and then follow the API link, which provides details on how to use the service. Note that there is a daily limit on the number of credits available for API searching.

An example with curl is below. This command will retrieve the known vulnerabilities for the Jackson JSON library in Java, specifically for version 2.9.0:

curl -k --data "apikey=xxxx&advancedsearch=vendor:Fasterxml,product:Jackson,version:2.9.0" \
  -X POST "https://vuldb.com/?api"

Simply replace the xxxx with your API key.
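If you would rather not paste the key into a command line, the same form fields can be assembled in Python. This is a minimal sketch: the function name and the VULDB_API_KEY environment variable are my own choices, not something Vuldb prescribes, and the field names are simply those used in the curl example above.

```python
import os
from urllib.parse import urlencode


def build_payload(vendor, product, version, api_key):
    """Build the same form fields as the curl example."""
    search = f"vendor:{vendor},product:{product},version:{version}"
    return {"apikey": api_key, "advancedsearch": search}


# VULDB_API_KEY is an assumed variable name; "xxxx" is the same placeholder as above.
api_key = os.environ.get("VULDB_API_KEY", "xxxx")
payload = build_payload("Fasterxml", "Jackson", "2.9.0", api_key)

# Show the URL-encoded search field without printing the key.
print(urlencode({"advancedsearch": payload["advancedsearch"]}))
```

The resulting dictionary can be passed as form data to whichever HTTP client you prefer.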

Essential Approach

The basic approach requires a listing of the CBOM for each language, followed by a curl or wget call to the vulnerability database for each entry. In the case of Vuldb, the output is in JSON form. It makes sense to normalize this data into a more CSV-like presentation for comparison purposes.
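As a rough illustration of that normalization step, the sketch below flattens nested JSON objects into rows with dotted column names and writes them out as CSV using only the standard library. The sample response is invented for the example and is not the documented Vuldb schema, so the column names will differ for real data.

```python
import csv
import io


def flatten(obj, prefix=""):
    """Flatten nested dicts into a single level with dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


# Hypothetical response shape, invented for illustration only.
sample_response = {
    "result": [
        {"entry": {"id": "1", "title": "Example issue A"},
         "source": {"cve": {"id": "CVE-0000-0001"}}},
        {"entry": {"id": "2", "title": "Example issue B"},
         "source": {"cve": {"id": "CVE-0000-0002"}}},
    ]
}

rows = [flatten(item) for item in sample_response["result"]]
columns = sorted({key for row in rows for key in row})

# Write to an in-memory buffer; swap in open("file.csv", "w", newline="") for a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Taking the union of keys across all rows means an entry that lacks a field simply gets an empty cell rather than breaking the export.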

A simple example in Python is to first save the information from pip-licenses using the command

pip-licenses --with-urls --format=csv > python-licenses.txt

Then a simple Python script such as the one below can be used to automate the retrieval of data from Vuldb.

import pandas as pd
import requests

# Constants - you need to insert your API key
API_KEY = "xxxx"
CSV_FILENAME = "python-licenses.txt"
VULDB_URI = "https://vuldb.com/?api"

if __name__ == "__main__":

    # Read the CSV report produced by pip-licenses
    df = pd.read_csv(CSV_FILENAME)

    # Iterate over rows in the dataframe, one library per row
    for index, row in df.iterrows():
        name = row["Name"]
        version = row["Version"]

        # The version goes inside advancedsearch, as in the curl example
        data = {"apikey": API_KEY,
                "advancedsearch": f"product:{name},version:{version}"}

        # Request data from Vuldb as a form-encoded POST
        response = requests.post(VULDB_URI, data=data, timeout=30)
        if response.status_code == 200:
            body = response.json()
            if "result" in body:
                print(body["result"])
        else:
            print(response.status_code)
            print("Error retrieving data")
	

This script will iterate through the contents of the file obtained from pip-licenses and then print out the information for each line item.

Bear in mind that Vuldb enforces API limits, so the script may stop returning results once the daily credits are used up.
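One way to stretch the available credits is to cache results on disk and cap the number of requests per run. The sketch below is one possible design rather than anything Vuldb recommends: the class name, the cache file name and the budget are assumptions, and fake_fetch is a stand-in for the real HTTP call.

```python
import json
import tempfile
from pathlib import Path


class CachedLookup:
    """Wrap a lookup function with a JSON file cache and a per-run request cap."""

    def __init__(self, fetch, cache_path, max_requests):
        self.fetch = fetch                # callable(name, version) -> JSON-serialisable
        self.cache_path = Path(cache_path)
        self.max_requests = max_requests  # assumed budget; set it from your own plan
        self.requests_made = 0
        self.cache = {}
        if self.cache_path.exists():
            self.cache = json.loads(self.cache_path.read_text())

    def lookup(self, name, version):
        key = f"{name}=={version}"
        if key in self.cache:
            return self.cache[key]        # served locally, no credit spent
        if self.requests_made >= self.max_requests:
            return None                   # budget exhausted; retry on a later run
        self.requests_made += 1
        result = self.fetch(name, version)
        self.cache[key] = result
        self.cache_path.write_text(json.dumps(self.cache))
        return result


def fake_fetch(name, version):
    """Stand-in for the real API call; returns placeholder data."""
    return [{"title": f"placeholder for {name} {version}"}]


with tempfile.TemporaryDirectory() as tmp:
    lookup = CachedLookup(fake_fetch, Path(tmp) / "vuldb-cache.json", max_requests=1)
    first = lookup.lookup("examplelib", "1.2.0")   # spends the only request
    again = lookup.lookup("examplelib", "1.2.0")   # served from the cache
    skipped = lookup.lookup("otherlib", "4.0")     # over budget, returns None
    print(first == again, skipped, lookup.requests_made)
```

Because the cache is written after every successful request, an interrupted run loses nothing, and libraries that were skipped are picked up on the next run.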

Conclusion

In conclusion, the main point here is that a home-grown tool will need quite a lot of work to automate CVE retrieval for all possible scenarios, as pointed out in [2]. Since considerable effort is needed to refine the search procedure, it may be prudent to invest in commercially available tools. Future posts will cover this in more detail.

References

  1. Veracode Blog, “Using CPEs for Open-Source Vulnerabilities? Think Again,” https://www.veracode.com/blog/managing-appsec/using-cpes-open-source-vulnerabilities-think-again
  2. L. Sanguino and R. Uetz, “Software Vulnerability Analysis Using CPE and CVE,” https://arxiv.org/pdf/1705.05347.pdf
