Data Accuracy & Methodology

How we collect, match, and present public records data — and why transparency matters to us.

OpenDataUSA aggregates data from multiple public sources across federal, state, and county government databases to create a comprehensive, searchable platform of publicly available records. We understand that when people search for information on our platform, they need to trust that what they find is as accurate and complete as possible.

This page explains our data collection process, how we match records across different databases, the inherent limitations of public records data, and what we do to keep our data accurate and our methods transparent. We believe that being open about our methodology is not just good practice; it is essential to earning and keeping your trust.

Data Collection

All of the data displayed on OpenDataUSA originates from government databases and public registries at the federal, state, and county levels. We do not create, fabricate, or infer data about individuals. Instead, we organize and index information that is already part of the public record.

Our primary data sources include:

  • Voter Registration Files: Publicly available voter rolls maintained by state election offices, containing names, addresses, party affiliation, and voting history where permitted by state law.
  • Property & Assessor Records: County-level property ownership data, assessed values, deed transfers, and parcel information from local assessor and recorder offices.
  • FEC Campaign Finance Data: Individual and organizational political contribution records filed with the Federal Election Commission, including donor names, employers, and contribution amounts.
  • USPTO Patent Database: Patent filings and grants from the United States Patent and Trademark Office, including inventor names, assignees, and filing dates.
  • Business Filings: State-level corporate registrations, LLC filings, trade name registrations, and annual reports from Secretaries of State offices.
  • Professional Licensing Boards: State-regulated professional licenses including healthcare providers (via NPI/NPPES), real estate agents, contractors, attorneys, and other licensed professions.
  • PPP Loan Disclosures: Small Business Administration Paycheck Protection Program loan recipient data, including business names, loan amounts, and forgiveness status.

We obtain this data through three primary channels:

Official APIs

Many government agencies provide programmatic access to their databases through official application programming interfaces, allowing us to retrieve data in structured formats.

Bulk Data Downloads

Federal and state agencies frequently publish bulk data exports on a regular schedule. We download and process these datasets as they become available.

FOIA Requests

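When data is not readily available through APIs or bulk downloads, we file public records requests with the appropriate agencies: Freedom of Information Act (FOIA) requests for federal agencies, and requests under the equivalent state open-records laws for state and county agencies.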

Update frequencies vary by source. Voter registration files are typically refreshed on a quarterly basis. Property and assessor records are updated monthly as counties publish new data. FEC campaign finance data is incorporated as it is reported by campaigns and committees. Other datasets, such as patent filings and professional licenses, are updated on rolling schedules as new records become available from their respective agencies.
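
As an illustration of how per-source refresh schedules like these could be tracked, here is a minimal Python sketch. It is not a description of OpenDataUSA's actual pipeline: the source keys, the day counts, and the is_stale helper are all assumptions made for this example.

```python
from datetime import date, timedelta

# Approximate intervals based on the cadences described above.
# The exact day counts are assumptions chosen for illustration.
REFRESH_INTERVAL_DAYS = {
    "voter_registration": 91,  # roughly quarterly
    "property_assessor": 31,   # roughly monthly
}

def is_stale(source: str, last_refreshed: date, today: date) -> bool:
    """Return True when a source has gone longer than its expected interval without a refresh."""
    interval = timedelta(days=REFRESH_INTERVAL_DAYS[source])
    return today - last_refreshed > interval

# 45 days since the last refresh: overdue for a monthly source,
# still within the window for a quarterly one.
print(is_stale("property_assessor", date(2024, 1, 1), date(2024, 2, 15)))   # True
print(is_stale("voter_registration", date(2024, 1, 1), date(2024, 2, 15)))  # False
```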

Identity Resolution & Record Matching

One of the most significant technical challenges in aggregating public records is determining when records from different databases refer to the same individual. A person may appear in voter registration files, property records, FEC donation records, and professional licensing databases — each with slightly different formatting, abbreviations, or details. Our identity resolution system works to connect these records accurately.

Our matching process relies on combinations of the following data points:

  • Full Legal Name: We compare first, middle, and last names across databases, accounting for common variations such as initials, suffixes (Jr., Sr., III), and alternate spellings.
  • Address History: Current and historical addresses help establish whether two records with the same name likely belong to the same individual, particularly when someone has moved between jurisdictions.
  • Date of Birth Indicators: Where available, age ranges or birth year data provide strong matching signals. Not all public records include this information, but when present, it significantly improves match confidence.
  • Geographic Location: Proximity and geographic overlap between records help narrow matches, especially for common names in large metropolitan areas.
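
To make the name comparison described above more concrete, the following Python sketch shows one simple way to normalize names before comparing them. It is illustrative only and not our production matching code: the SUFFIXES set and the function names are hypothetical, and the sketch handles initials and suffixes but not alternate spellings.

```python
import re

# Hypothetical suffix list; a real system would likely need a longer one.
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}

def normalize_name(raw: str) -> dict:
    """Split a raw name string into lowercase first, middle, last, and suffix parts."""
    # Lowercase, replace periods and commas with spaces, then split on whitespace.
    tokens = re.sub(r"[.,]", " ", raw.lower()).split()
    suffix = None
    if tokens and tokens[-1] in SUFFIXES:
        suffix = tokens.pop()
    first = tokens[0] if tokens else ""
    last = tokens[-1] if len(tokens) > 1 else ""
    middle = " ".join(tokens[1:-1])
    return {"first": first, "middle": middle, "last": last, "suffix": suffix}

def middle_compatible(a: str, b: str) -> bool:
    """Treat a missing middle name, or an initial that matches, as compatible."""
    if not a or not b:
        return True
    if len(a) == 1 or len(b) == 1:
        return a[0] == b[0]
    return a == b

def names_compatible(raw_a: str, raw_b: str) -> bool:
    """Return True when two raw name strings could plausibly be the same name."""
    a, b = normalize_name(raw_a), normalize_name(raw_b)
    if a["first"] != b["first"] or a["last"] != b["last"]:
        return False
    # When both records carry a suffix, the suffixes must agree (Jr. vs. Sr.).
    if a["suffix"] and b["suffix"] and a["suffix"] != b["suffix"]:
        return False
    return middle_compatible(a["middle"], b["middle"])

print(names_compatible("James A. Smith Jr.", "JAMES ALLEN SMITH, JR"))  # True
print(names_compatible("James Smith Jr.", "James Smith Sr."))           # False
```

A compatible name is only one signal. In the approach described on this page it would be combined with address, age, and geographic data rather than used alone.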

Common Name Disambiguation

Names like "James Smith," "Maria Garcia," or "David Johnson" appear thousands of times across national databases. Matching records for common names requires particular care. Our system uses a weighted scoring approach that considers multiple attributes simultaneously. A "James Smith" who appears at the same address in both voter records and property records with a consistent age range is far more likely to be the same person than two "James Smith" entries in different states with no overlapping attributes.
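
The idea of weighted scoring can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not our actual scoring model: the Record fields, the point values in WEIGHTS, and the one-year birth year tolerance are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Record:
    name: str
    address: Optional[str] = None
    birth_year: Optional[int] = None
    state: Optional[str] = None

# Hypothetical weights in points out of 100, chosen only for illustration.
WEIGHTS = {"name": 35, "address": 30, "birth_year": 25, "state": 10}

def match_score(a: Record, b: Record) -> float:
    """Return a score from 0.0 to 1.0 based on which attributes agree."""
    points = 0
    if a.name.lower() == b.name.lower():
        points += WEIGHTS["name"]
    if a.address and b.address and a.address.lower() == b.address.lower():
        points += WEIGHTS["address"]
    # Allow a one-year difference, since some sources report age ranges.
    if (a.birth_year is not None and b.birth_year is not None
            and abs(a.birth_year - b.birth_year) <= 1):
        points += WEIGHTS["birth_year"]
    if a.state and b.state and a.state == b.state:
        points += WEIGHTS["state"]
    return points / 100

voter = Record("James Smith", "12 Oak St", 1970, "OH")
deed = Record("James Smith", "12 Oak St", 1971, "OH")
other = Record("James Smith", "9 Elm Ave", None, "TX")

print(match_score(voter, deed))   # 1.0
print(match_score(voter, other))  # 0.35
```

In this toy example, the two Ohio records agree on every attribute and score 1.0, while the Texas record shares only the name and scores 0.35.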

Confidence Scoring

Not all record matches carry the same level of certainty. Our system assigns a confidence score to each potential match based on the number and quality of overlapping attributes. Matches supported by multiple independent data points (name, address, and age range, for example) receive higher confidence scores than those based on name alone.

Critically, potential matches that do not reach sufficient confidence are not displayed on our platform. We would rather show incomplete results than risk attributing records to the wrong person. This conservative approach means that some legitimate connections between records may not appear in search results, but it significantly reduces the risk of false associations.
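
A conservative display rule of this kind can be sketched as a simple threshold filter. The Python below is a minimal illustration: the CandidateMatch type, the record identifiers, and the 0.8 cutoff are assumptions for this example, and the actual threshold is not stated on this page.

```python
from typing import List, NamedTuple

class CandidateMatch(NamedTuple):
    record_id: str
    confidence: float  # 0.0 (no confidence) to 1.0 (full confidence)

# Hypothetical cutoff; the real threshold is not published here.
DISPLAY_THRESHOLD = 0.8

def displayable(matches: List[CandidateMatch]) -> List[CandidateMatch]:
    """Keep only matches at or above the threshold, highest confidence first."""
    kept = [m for m in matches if m.confidence >= DISPLAY_THRESHOLD]
    return sorted(kept, key=lambda m: m.confidence, reverse=True)

candidates = [
    CandidateMatch("voter-1", 0.95),
    CandidateMatch("deed-7", 0.55),
    CandidateMatch("fec-3", 0.82),
]

print([m.record_id for m in displayable(candidates)])  # ['voter-1', 'fec-3']
```

Raising the cutoff hides more uncertain matches at the cost of omitting some legitimate ones, which is the trade-off described above.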

Data Accuracy & Limitations

We strive for accuracy in everything we present, but it is important for users to understand the inherent limitations of public records data. Transparency about these limitations is part of our commitment to honest, responsible data presentation.

Source Database Errors

Government databases may contain data entry errors, misspellings, or outdated information. Since we aggregate data as-is from these sources, any errors present in the original records will be reflected in our results.

Data Lag

There is always a delay between real-world events (moving, changing jobs, selling property) and when those changes are reflected in government databases. Records displayed on our platform may not reflect the most current information.

Name Variations

Nicknames, maiden names, hyphenated names, and legal name changes can cause gaps in record matching. A person known as "Bill" in one database and "William" in another may not always be connected correctly.
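
One common way to reduce nickname mismatches is a lookup table that maps nicknames to a canonical first name. The Python sketch below illustrates the general technique only and is not a description of how our system works: the NICKNAMES table is hypothetical and deliberately tiny, and a table like this cannot cover maiden names or legal name changes.

```python
# Hypothetical, deliberately small nickname table for illustration.
NICKNAMES = {
    "bill": "william",
    "will": "william",
    "bob": "robert",
    "liz": "elizabeth",
    "beth": "elizabeth",
}

def canonical_first_name(name: str) -> str:
    """Map a known nickname to its canonical form; otherwise return the cleaned name."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)

def same_first_name(a: str, b: str) -> bool:
    """Compare two first names after nickname canonicalization."""
    return canonical_first_name(a) == canonical_first_name(b)

print(same_first_name("Bill", "William"))  # True
print(same_first_name("Bill", "Robert"))   # False
```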

Possible Misattribution

In rare cases, records may be attributed to a different person who happens to share a similar name and geographic area. Our confidence scoring reduces this risk, but it cannot be eliminated entirely.

Important: Users should always verify important information independently through official sources before relying on it for any purpose. OpenDataUSA provides aggregated public data as a starting point for research, not as a definitive source of truth.

FCRA Compliance Notice

OpenDataUSA is NOT a consumer reporting agency as defined by the Fair Credit Reporting Act (FCRA), 15 U.S.C. Section 1681 et seq. The data available through our platform may not be used, in whole or in part, as a factor in establishing an individual's eligibility for any of the following purposes:

  • Employment or hiring decisions
  • Tenant screening or rental housing decisions
  • Credit or lending decisions
  • Insurance underwriting or eligibility
  • Educational enrollment decisions
  • Any other purpose that would require compliance with the FCRA

By using OpenDataUSA, you agree to abide by these restrictions. Any use of our data for FCRA-regulated purposes is strictly prohibited and constitutes a violation of our Terms of Service. For complete details on permissible use of our platform, please review our Privacy Policy.

Data Removal & Corrections

We respect every individual's right to control how their information appears on our platform. If you find information about yourself on OpenDataUSA that you would like removed or corrected, we offer straightforward processes to address your concerns.

Request Removal

To have your information removed from our search results, visit our opt-out page and follow the instructions. Removal requests are free and do not require you to create an account.

Report Inaccuracies

If you notice incorrect information in your records, please contact us with details about the error so we can investigate and make corrections where possible.

We aim to process all removal and correction requests within 48 hours of receiving them. Once processed, the changes will be reflected in our search results immediately. Please note that removing your information from OpenDataUSA does not remove it from the original government databases where it is maintained.

For more information about your data privacy rights, including rights under state-specific privacy laws, visit our guide to data privacy rights.

Our Transparency Commitment

Transparency is a core principle at OpenDataUSA. We believe that a data aggregation platform has a responsibility to be open about what data it collects, where that data comes from, and how it is processed. Here are the commitments we make to every user:

Publicly Available Sources Only

We are committed to using only publicly available information. Every data point on our platform originates from a government database, public registry, or official open-records source. We do not purchase data from private data brokers, scrape social media profiles, harvest information from private databases, or access any data that is not already part of the public record.

No Social Media Scraping

We do not scrape, index, or display information from social media platforms including Facebook, Instagram, Twitter/X, LinkedIn, TikTok, or any other social networking service. The information on our platform comes exclusively from government and regulatory sources.

Source Documentation

We maintain detailed documentation of every data source category we use, including the originating agency, the type of data collected, and how frequently it is updated. You can review this information on our Data Sources pages.

Questions About Our Methodology?

If you have questions about how we collect, process, or present data, we are happy to provide additional detail. Reach out to our team anytime.