Frequently Asked Questions
Q. Can I download the Data Breach Chronology Database?
A. Yes! You may purchase the Data Breach Chronology Database here.
We are committed to maintaining and improving this project over the long term. By purchasing this data, you are funding continued development of this resource. We currently offer temporary download passes (subject to our Terms of Service) for our database on a sliding scale.
- For organizations with an operating budget of under $1 million, the cost is $250 for a single download.
- For organizations with an operating budget of over $1 million and researchers with funded projects, the cost is $450 for a single download.
If you are working on unfunded research (including class projects) or for a nonprofit or media outlet on a limited budget, you can request a fee waiver. To request complimentary access for a limited time, please contact us at email@example.com and briefly describe your proposed use and affiliation. Our team will respond to you as soon as possible; however, we cannot guarantee that we will be able to respond in time for any deadline.
Q. What is the Data Breach Chronology?
A. The Data Breach Chronology is a tool designed to help advocates, policymakers, journalists and researchers better understand reported data breaches in the United States. We launched the project in 2005 in response to the widely publicized ChoicePoint incident, and it has evolved over time from a list of manually entered breaches to a robust database and visual dashboard.
Q. How is this project funded?
A. This project was funded in large part by The Rose Foundation for Communities and the Environment Consumer Products Fund. We have also received funds for this project from cy pres awards and the Consumer Federation of America.
If you are interested in supporting this project, please reach out to us at firstname.lastname@example.org.
Q. Is this a complete record of every data breach in the United States?
A. No. The data consists of publicly available information on reported breaches and should not be considered a complete and accurate representation of every data breach in the United States. It reflects breaches reported in the United States that are made publicly available by government entities.
Q. What are the next steps for this project?
A. In 2023, we are seeking additional funding to:
- Develop an updated taxonomy of breach types and business types;
- Further enrich the database with unique business identifiers where possible;
- Better identify duplicate breach entries;
- Regularly update the database; and
- Develop additional visualizations and tools to help those working to advance data privacy and security for people.
If you are interested in getting updates on this project, join our email list here.
Q. How can I get involved with this project?
A. Thank you for your interest – there is no shortage of work that can be done to continue to improve this project, and there are many ways to join us in that endeavor!
- Donate your time and expertise as a data science or Tableau volunteer to help us collect, clean, process, maintain, and present this resource. Contact us at email@example.com with the subject line “VOLUNTEER”.
- Apply for a legal internship to help us stay up to date on changing data security and breach notification laws.
- Apply to join our Data Breach Chronology advisory committee to help drive future project decisions and new features. Contact us at firstname.lastname@example.org with the subject line “ADVISORY COMMITTEE”.
- Donate to sustain the project.
Q. Something doesn't look right...
A. We're experiencing some minor rendering bugs on our website that we hope to clear up shortly. If the dashboard appears cut off or isn't rendering to the full width of your browser, try reloading the page, or clicking on the "full screen" button on the bottom right hand corner of the dashboard.
Data Breach Chronology Data
Q. What data makes up the Data Breach Chronology?
A. We collect data from publicly available, government-maintained data sources. This includes the U.S. Department of Health and Human Services and various state Attorneys General who publish data breach notices they receive under their states’ data breach notification laws.
Q. How far back does the data go?
A. Our historical database has been cleaned and normalized going back to 2005.
Q. What happens to the raw data after it is collected?
A. We begin the challenging task of cleaning and normalizing the raw data so that it can be entered into a single, usable database. Our data through 2021 has been cleaned and normalized thanks to the Coleman Research Lab.
Q. How is the data processed to extract relevant information?
A. We use a combination of human and AI resources to process the raw data, extract relevant information, and apply classifications including type of breach, type of organization, number of records exposed in the breach, and relevant date information.
Q. How is AI involved in processing the data?
A. We use OpenAI’s GPT-3, a powerful language model, to help us classify the breach data. After processing, but before pushing any updates to our database, we pull a random selection of the database (~1%) to check the classifications against the available information.
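For readers who want to run a similar spot check on their own copy of the data, here is a minimal sketch of drawing a random ~1% review sample. It is an illustration only, not our production code: the function name `spot_check_sample`, the `fraction` and `seed` parameters, and the record fields are all hypothetical.

```python
import random


def spot_check_sample(records, fraction=0.01, seed=None):
    """Return a random subset (about `fraction` of the records) for manual review.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if not records:
        return []
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    # Review at least one record, even when the batch is very small.
    k = max(1, round(len(records) * fraction))
    return rng.sample(records, k)  # sampling without replacement


# Example with made-up records: 1,000 entries yield a 10-record review sample.
records = [{"id": i, "breach_type": "HACK"} for i in range(1000)]
sample = spot_check_sample(records, seed=0)
print(len(sample))
```

Sampling without replacement means no record is reviewed twice in the same pass, and keeping a seed on file lets a second reviewer reproduce the same selection.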
Q. What are the labels for breach type and business type?
Type of Breach:
CARD - Fraud Involving Debit and Credit Cards Not Via Hacking (skimming devices at point-of-service terminals, etc.)
HACK - Hacked by an Outside Party or Infected by Malware
INSD - Insider (employee, contractor or customer)
PHYS - Physical (paper documents that are lost, discarded or stolen)
PORT - Portable Device (lost, discarded or stolen laptop, PDA, smartphone, memory stick, CDs, hard drive, data tape, etc.)
STAT - Stationary Computer Loss (lost, inappropriately accessed, discarded or stolen computer or server not designed for mobility)
DISC - Unintended Disclosure Not Involving Hacking, Intentional Breach or Physical Loss (sensitive information posted publicly, mishandled or sent to the wrong party via publishing online, sending in an email, sending in a mailing or sending via fax)
UNKN - Unknown (not enough information about breach to know how exactly the information was exposed)
Type of Organization:
BSF - Businesses (Financial Services, Banking, Insurance Services)
BSO - Businesses (Manufacturing, Technology, Communications, Other)
BSR - Businesses (Retail/Merchant including Grocery Stores, Online Retailers, Restaurants)
EDU - Educational Institutions (Schools, Colleges, Universities)
GOV - Government & Military (State & Local Governments, Federal Agencies)
MED - Healthcare and Medical Providers (Hospitals, Medical Insurance Services)
NGO - Nonprofits (Charities and Religious Organizations)
UNKN - Unknown
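If you work with the downloaded data in code, the labels above can be kept in simple lookup tables. The sketch below is one possible arrangement, with shortened descriptions; the names `BREACH_TYPES`, `ORGANIZATION_TYPES`, and `validate_labels` are hypothetical and are not part of the database itself.

```python
# Codes copied from the lists above; descriptions are shortened for readability.
BREACH_TYPES = {
    "CARD": "Fraud involving debit and credit cards, not via hacking",
    "HACK": "Hacked by an outside party or infected by malware",
    "INSD": "Insider (employee, contractor or customer)",
    "PHYS": "Physical (paper documents lost, discarded or stolen)",
    "PORT": "Portable device (lost, discarded or stolen)",
    "STAT": "Stationary computer loss",
    "DISC": "Unintended disclosure",
    "UNKN": "Unknown",
}

ORGANIZATION_TYPES = {
    "BSF": "Businesses (financial services, banking, insurance)",
    "BSO": "Businesses (manufacturing, technology, communications, other)",
    "BSR": "Businesses (retail/merchant)",
    "EDU": "Educational institutions",
    "GOV": "Government and military",
    "MED": "Healthcare and medical providers",
    "NGO": "Nonprofits",
    "UNKN": "Unknown",
}


def validate_labels(breach_type, organization_type):
    """Return a list of problems; an empty list means both codes are recognized."""
    problems = []
    if breach_type not in BREACH_TYPES:
        problems.append(f"unrecognized breach type: {breach_type!r}")
    if organization_type not in ORGANIZATION_TYPES:
        problems.append(f"unrecognized organization type: {organization_type!r}")
    return problems


print(validate_labels("HACK", "MED"))
print(validate_labels("LOST", "MED"))
```

A check like this can catch typos or unexpected codes before you aggregate by breach type or organization type.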
Q. Have you considered revising the labels for breach type and business type?
A. Yes. We plan to address this when we secure resources for this project.
Data Breach Chronology Dashboard
Q. I have an idea for a new visualization. Can I provide input for this project?
A. Yes! Please let us know at email@example.com and include “SUGGESTION” in the subject line.
Q. I believe a breach has incorrect information associated with it. How can I alert you?
A. Please email us at firstname.lastname@example.org and include “CORRECTION” in the subject line followed by the name of the breached business that you believe needs to be corrected. In the body of your email, please include the specific reported breach and the proposed correction so we may review.
Limitations and Disclaimers
The Data Breach Chronology is based on publicly available information and should not be considered a complete and accurate representation of every data breach in the United States. Rather, it reflects the data breaches that have been reported and made publicly available in the United States.
You should pay careful attention to the issue of duplicate reporting when making use of this data or making assertions based on this data. This version of the Data Breach Chronology does not identify when a single breach has been reported to multiple state Attorneys General. As of February 2023, there are duplicated breaches in the database, and they are reflected in the dashboard.
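As a rough starting point for your own analysis, the sketch below flags records that may describe the same breach reported to more than one source. It is a simple heuristic and not a method used by this project: the field names `organization`, `breach_date`, and `source` are assumptions and may not match the columns in your download, and the example records are made up.

```python
from collections import defaultdict


def normalize_name(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def candidate_duplicates(records):
    """Group records by (normalized organization, breach date).

    Returns only the groups that were reported by more than one source.
    """
    groups = defaultdict(list)
    for record in records:
        key = (normalize_name(record["organization"]), record["breach_date"])
        groups[key].append(record)
    return [g for g in groups.values() if len({r["source"] for r in g}) > 1]


# Made-up example records; field names are hypothetical.
records = [
    {"organization": "Acme, Inc.", "breach_date": "2021-03-01", "source": "State A"},
    {"organization": "ACME Inc", "breach_date": "2021-03-01", "source": "State B"},
    {"organization": "Example Hospital", "breach_date": "2020-07-15", "source": "HHS"},
]
for group in candidate_duplicates(records):
    print([r["source"] for r in group])
```

Matching on name and date alone will miss duplicates whose reported dates differ and may merge unrelated incidents, so treat the output as candidates for manual review rather than confirmed duplicates.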
Additionally, although we have scraped the contents of breach notification letters where possible and include a link to the original PDF, we do not host these letters locally, and links may no longer be active.
Privacy Rights Clearinghouse makes no representations as to the accuracy of the information included in the Data Breach Chronology.