How to build a CVEDetails alternative website?

Jan 7, 2021

Yes, it is an obvious question, and one I spent months trying to answer. If you don’t want to read this whole post to get the answer, just follow the 4 steps below:

Step 1. Clone the CVEDataFeed repository and set it up by following its guide
> git clone https://github.com/cuongmx/CVEDataFeed.git
Step 2. Create a MongoDB database, using something like mLab or MongoDB Atlas
Step 3. Set up the environment and run the command that imports the database from NVD
> python3 cvedatafeed.py importonline
Step 4. Build a frontend to browse all the collections in MongoDB (like https://cvedata.com)

And if you don’t want to build a frontend either, just visit my new product: https://cvedata.com

In this post, I will note some key points about building CVEData, whose main goal is to cover all the functions of CVEDetails.

0. A story

I work as a penetration tester and security consultant, so “Docx coding” (reporting) is one of my main daily tasks. One of the bug types I most enjoy reporting is “Using Components with Known Vulnerabilities”, because all I have to do is paste the product’s link from cvedetails.com. That made me a big fan of CVEDetails. One day, as usual, I pasted the link into my report, sent it to my customer, and went for a coffee. A few moments later, my kind customer replied: “Where are my CVEs from 2020?”

Oh man, CVEDetails had not been updated since Nov 2019.

Like any fan chasing a trend, I tried to find some alternatives:

Google only showed some popular sites that are nothing like CVEDetails.

Like a true fan, I emailed the author (Mr. Serkan Özkan) to ask for an update.

Mr. Serkan Özkan kindly replied and pointed me to another good product (vulniq.com). However, it did not feel familiar to me :(

So, in my email to Mr. Serkan Özkan, I promised to build another site if he discontinued CVEDetails.

I am someone who keeps promises. So, about a month ago, I started researching CVEDetails. Fortunately, I found 2 keywords that make this task very feasible:

**NVD data source** from Serkan Özkan’s slide at Black Hat 2012

1. NVD Datasource

NVD (National Vulnerability Database - https://nvd.nist.gov/) is the original and fastest-updating data source for CVE details (not cve.mitre.org). The staff at NVD work very hard: they release CVE updates every 2 hours, including holidays (❤). To be clear, CVEDetails, CVEData, or any other CVE site just presents NVD data in a different way.

To pull data from NVD, you just need to download the JSON files from https://nvd.nist.gov/vuln/data-feeds (all CVE data since 1999). To stay up to date, follow the CVE-Modified and CVE-Recent feeds. Note: CVE-Modified and CVE-Recent only hold the most recent 7 days of data, so the update job must run at least once every 7 days.
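The 7-day rule above can be sketched as a small decision function. This is only an illustration, not the CVEDataFeed code: the name `choose_sync_mode` and the labels `"full"` and `"incremental"` are my own assumptions, and the window is simply the 7 days mentioned above.

```python
from datetime import datetime, timedelta
from typing import Optional

# Window covered by the CVE-Modified / CVE-Recent feeds, as stated in this post.
MODIFIED_WINDOW = timedelta(days=7)


def choose_sync_mode(last_sync: Optional[datetime], now: datetime) -> str:
    """Decide whether the incremental feeds still cover the gap since last_sync.

    Returns "full" when the yearly feeds must be re-imported, and
    "incremental" when CVE-Modified / CVE-Recent are enough.
    """
    if last_sync is None:
        # Nothing imported yet: the whole history is needed.
        return "full"
    if now - last_sync >= MODIFIED_WINDOW:
        # The gap is at least as long as the window, so changes may be missed.
        return "full"
    return "incremental"


if __name__ == "__main__":
    print(choose_sync_mode(datetime(2021, 1, 5), datetime(2021, 1, 7)))
```

If the scheduled job ever skips more than a week, a check like this falls back to a full re-import instead of silently losing updates.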

2. CPE Name

CPE (Common Platform Enumeration - https://nvd.nist.gov/products/cpe) is a naming scheme, maintained by NIST alongside NVD, that uniquely identifies systems, software, and packages as a structured string.

For example:

cpe:2.3:o:linux:linux_kernel:2.4.7:*:*:*:*:*:*:* identifies the Linux Kernel product, version 2.4.7, from the vendor Linux, with part type “o” (operating system). The leading cpe:2.3 is the version of the CPE specification, not of CVE.
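To make the layout of that string concrete, here is a minimal parsing sketch. The field names follow the CPE 2.3 formatted-string order; the function name `parse_cpe23` is my own, and the simple split on `:` assumes no escaped colons inside a field, which a real parser would have to handle.

```python
# Field order of a CPE 2.3 formatted string, after the "cpe:2.3:" prefix.
CPE_FIELDS = (
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
)

# Meaning of the one-letter "part" field.
PART_NAMES = {"a": "application", "o": "operating system", "h": "hardware"}


def parse_cpe23(cpe: str) -> dict:
    """Split a CPE 2.3 formatted string into named fields.

    Simplification: assumes no field contains an escaped colon.
    """
    pieces = cpe.split(":")
    if len(pieces) != 13 or pieces[0] != "cpe" or pieces[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    return dict(zip(CPE_FIELDS, pieces[2:]))


if __name__ == "__main__":
    fields = parse_cpe23("cpe:2.3:o:linux:linux_kernel:2.4.7:*:*:*:*:*:*:*")
    print(PART_NAMES[fields["part"]], fields["vendor"], fields["product"], fields["version"])
```

Grouping CVEs by the `vendor` and `product` fields is what makes CVEDetails-style vendor and product pages possible.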

3. Some others

Here are some other problems I solved:

Vulnerability type

This is a very useful feature of CVEDetails. However, it is not part of the original NVD data. According to the author, he classifies CVEs by matching keywords and CWE IDs.

Following the author’s guide, I tried some statistical approaches and arrived at a good set of keywords and CWEs. Comparing the original CVEDetails classification against my set, the agreement is about 99%.

Comparison results (more details on GitHub):
#testFilter("exec code",[r"(code|command).*(execution|execute)", r"(execution|execute).*(code|command)"])
#out: 10552/10552
#testFilter("dos",[r"denial of service"])
#out: 8260/8260
#testFilter("overflow",[r"overflow", r"(restrict|crash|invalid|violat|corrupt).*(buffer|stack|heap|memory)", r"(buffer|stack|heap|memory).*(restrict|crash|invalid|violat|corrupt)"])
#out: 5242/5814
#testFilter("priv",[r"(gain|escalat).*privil", r"privil.*(gain|escalat)"])
#out: 1910/1910
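As a rough illustration of how such keyword filters can work, here is a minimal sketch. It reuses three of the regex sets listed above, but the functions `classify` and `match_rate` are my own hypothetical helpers, not the `testFilter` from the repository, and they only look at the description text (no CWE matching).

```python
import re

# Regex sets copied from the comparison above; labels mirror CVEDetails types.
VULN_TYPE_PATTERNS = {
    "exec code": [r"(code|command).*(execution|execute)",
                  r"(execution|execute).*(code|command)"],
    "dos": [r"denial of service"],
    "priv": [r"(gain|escalat).*privil", r"privil.*(gain|escalat)"],
}


def classify(description: str) -> list:
    """Return every vulnerability type whose patterns match the description."""
    text = description.lower()
    return [
        label
        for label, patterns in VULN_TYPE_PATTERNS.items()
        if any(re.search(pattern, text) for pattern in patterns)
    ]


def match_rate(label: str, descriptions: list) -> tuple:
    """Count how many descriptions already tagged with `label` are matched."""
    matched = sum(1 for text in descriptions if label in classify(text))
    return matched, len(descriptions)


if __name__ == "__main__":
    sample = "Allows local users to gain privileges or cause a denial of service."
    print(classify(sample))
```

Running `match_rate` over the descriptions that CVEDetails already tags with a given type gives a matched/total pair in the same spirit as the `#out:` lines above.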

CVSS v2 or v3

In fact, many CVEs have more than just a CVSS2 score, yet CVEDetails only shows the CVSS2 vector string. That limitation cannot fully express the complexity of a CVE. In CVEData, I prioritize CVSS3, and when displaying a CVE on the website, I try to convert some CVSS2 fields to CVSS3.

privilegesRequired, userInteraction, and scope are fields that CVSS2 lacks.
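One way such a conversion could look is sketched below. The key names are the ones used in the NVD JSON metrics, but the mapping tables are purely my own assumption for display purposes (in particular, folding MEDIUM complexity into LOW); this is not an official CVSS conversion and not necessarily what CVEData does.

```python
# Assumed display mappings from CVSS2 values to the closest CVSS3 values.
V2_IMPACT_TO_V3 = {"NONE": "NONE", "PARTIAL": "LOW", "COMPLETE": "HIGH"}
V2_COMPLEXITY_TO_V3 = {"LOW": "LOW", "MEDIUM": "LOW", "HIGH": "HIGH"}


def v2_to_v3_display(v2: dict) -> dict:
    """Build a CVSS3-shaped dict from CVSS2 fields for display only.

    Fields that CVSS2 cannot express are left as None instead of guessed.
    """
    return {
        "attackVector": v2.get("accessVector"),
        "attackComplexity": V2_COMPLEXITY_TO_V3.get(v2.get("accessComplexity")),
        "privilegesRequired": None,  # no CVSS2 equivalent
        "userInteraction": None,     # no CVSS2 equivalent
        "scope": None,               # no CVSS2 equivalent
        "confidentialityImpact": V2_IMPACT_TO_V3.get(v2.get("confidentialityImpact")),
        "integrityImpact": V2_IMPACT_TO_V3.get(v2.get("integrityImpact")),
        "availabilityImpact": V2_IMPACT_TO_V3.get(v2.get("availabilityImpact")),
    }


if __name__ == "__main__":
    print(v2_to_v3_display({"accessVector": "NETWORK", "accessComplexity": "MEDIUM"}))
```

Leaving the three missing fields empty keeps the page honest: the reader sees CVSS3-style labels without invented values.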

CVE Name

This is missing from CVEDetails and other vulnerability data sites. I build it from the CVE ID, the vulnerability type, and the affected vendor and product.
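A name like that could be assembled along these lines. The output format, the helper name `build_cve_name`, and the placeholder CVE ID are all made up for illustration; the real CVEData naming may differ.

```python
def build_cve_name(cve_id: str, vuln_types: list, vendor: str, product: str) -> str:
    """Compose a readable title from the CVE ID, vulnerability types, and CPE names.

    The "<id>: <types> in <Vendor> <Product>" layout is an assumed format.
    """
    def pretty(cpe_value: str) -> str:
        # CPE values use underscores and lower case, e.g. "linux_kernel".
        return cpe_value.replace("_", " ").title()

    kinds = ", ".join(vuln_types) if vuln_types else "Vulnerability"
    return f"{cve_id}: {kinds} in {pretty(vendor)} {pretty(product)}"


if __name__ == "__main__":
    # "CVE-0000-0000" is a placeholder ID, not a real CVE.
    print(build_cve_name("CVE-0000-0000", ["Exec Code"], "linux", "linux_kernel"))
```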

4. CVEData architecture

This is a community project, so cost matters. There were 3 criteria for choosing the architecture:

Full automation, no manual operation
Good vendor, good infrastructure
Free or cheap

Detailed configuration:

Protection, HTTPS: Cloudflare ~ free
Front-end: Django running on Google App Engine ~ free for 1000 hours/month :-S
Back-end: Google Cloud Functions triggered by Cloud Scheduler ~ free for 3 jobs
DB: MongoDB Atlas, free tier max 500 MB; the total data size is about 700 MB, but I have a voucher for 1 year ~ free for 1 year (hope CVEData lives longer than 1 year :-P)
Monitoring: UptimeRobot ~ free
Source repo: GitHub

5. Next steps

I know threat intelligence is the current trend. However, the classic (dictionary-like) style still has its value, at least for me. Going forward, I have some ideas to continue with:

Build bug trend tracking to catch bug bounty trends $_$
CVE Awards: best CVE, hottest CVE, voting,…
Add more data sources to get each CVE’s author and build a Hall of Fame for CVEs.

If you enjoyed this or have any other ideas, please let me know. Thank you!