Public Data is Coming in a Huge Way
The basis for InjuryLawyerDatabase.com is public data. We use lawsuit filings to publish statistics on injury lawyers and defendants. The vast majority of lawsuits are public. We use algorithms to scrape data from the State of Maryland websites that house the lawsuit data.
We believe the days of picking a lawyer from TV, the Yellow Pages, or casual lawyer-to-lawyer referrals have peaked. In the future, consumers will use real data to make decisions.
But our project is just the tip of the iceberg. There are zettabytes of data in existence. It would probably take about a trillion thumb drives to hold it all. (Though an incredibly rough estimate, that’s actually not just hyperbole. One zettabyte is equal to 1,099,511,627,776 gigabytes when counted in binary units, or an even trillion gigabytes in decimal units.) Much of the data in existence is public data. Governments have been the largest processors, collectors, and cataloguers of data.
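For anyone who wants to check the thumb-drive arithmetic, here is a rough sketch of the unit conversion. The function names and the drive sizes are illustrative assumptions, not figures from any source; only the unit definitions themselves are standard.

```python
# Unit arithmetic behind the zettabyte estimate.
# Decimal (SI) units: 1 ZB = 10**21 bytes, 1 GB = 10**9 bytes.
# Binary units:       1 ZiB = 2**70 bytes, 1 GiB = 2**30 bytes.
ZETTABYTE_DECIMAL = 10**21
GIGABYTE_DECIMAL = 10**9
ZETTABYTE_BINARY = 2**70
GIGABYTE_BINARY = 2**30


def gigabytes_per_zettabyte(binary: bool = False) -> int:
    """Return how many gigabytes fit in one zettabyte."""
    if binary:
        return ZETTABYTE_BINARY // GIGABYTE_BINARY
    return ZETTABYTE_DECIMAL // GIGABYTE_DECIMAL


def drives_needed(zettabytes: int, drive_gb: int, binary: bool = False) -> int:
    """Thumb drives needed to hold the data (drive size is an assumption)."""
    total_gb = zettabytes * gigabytes_per_zettabyte(binary)
    return -(-total_gb // drive_gb)  # ceiling division


print(gigabytes_per_zettabyte(binary=True))   # 1099511627776
print(gigabytes_per_zettabyte(binary=False))  # 1000000000000
print(drives_needed(1, drive_gb=1))           # 1000000000000
print(drives_needed(1, drive_gb=64))          # 15625000000
```

So "about a trillion" holds for one zettabyte on 1 GB drives; larger drives or more zettabytes shift the count, but it stays in the billions to trillions.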
Who Cares?
The ability to analyze large amounts of data will revolutionize your life, whether you are aware of it or not. Let’s just think about it from the legal perspective for one second.
- Imagine if you knew, in real time, what complaints the FDA was getting. You’d know where the future lawsuits were before anyone else did.
- Imagine if you knew the exact average sentence for a particular criminal case sorted by defendant age, race and geography, filtered by judge.
- Imagine if you knew when licenses were granted to your best client’s competitors, before your client did.
- Imagine if…
There is a virtually unlimited number of “imagine ifs” that could benefit you.
Now a new start-up, Enigma (enigma.io), is making more of this possible. According to their website, the company has obtained databases from more than 100,000 public data sources in areas such as financial filings, lobbying, government spending, real estate, and even aircraft ownership. Enigma is promising the ability to search across different “silos” of data easily. For instance, you can already look up the most recent financial results for a company, what kind of real estate they own, what their assets are, what licenses they have, etc. But you would have to visit a myriad of different websites to see whether the data was available, request access where it wasn’t, then download it and analyze it. Enigma puts it all in one place.
According to TechCrunch, much of Enigma’s data was found by filing Freedom of Information Act requests. They then presumably use the same methods we’re using for Maryland litigation – algorithms to scrape, parse and collate. The same TechCrunch article suggests that Enigma views the public data as a whole extra layer to the Internet. That’s exactly how I see it. In essence, all my company did was provide a bridge from Maryland’s litigation data housed by state government to the web as a whole.
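To make "scrape, parse and collate" a little more concrete, here is a minimal sketch of the parse and collate steps using only Python's standard library. Everything in it is an assumption for illustration: the sample HTML, the column names, and the helper names (TableParser, parse_filings, collate) are hypothetical, and this is not the actual code or page layout behind our site or Enigma's. The scrape step is replaced by an inline sample so the example runs without touching any real website.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical stand-in for a fetched court-records page. The markup and
# column names are invented for this sketch.
SAMPLE_HTML = """
<table>
  <tr><th>Case</th><th>Plaintiff Attorney</th><th>Defendant</th></tr>
  <tr><td>C-0001</td><td>Attorney A</td><td>Defendant X</td></tr>
  <tr><td>C-0002</td><td>Attorney B</td><td>Defendant X</td></tr>
  <tr><td>C-0003</td><td>Attorney A</td><td>Defendant Y</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_filings(html):
    """Parse: turn the table into one dict per filing, keyed by header."""
    parser = TableParser()
    parser.feed(html)
    header, *records = parser.rows
    return [dict(zip(header, record)) for record in records]


def collate(filings, field):
    """Collate: count filings per distinct value of the given field."""
    return Counter(filing[field] for filing in filings)


filings = parse_filings(SAMPLE_HTML)
by_attorney = collate(filings, "Plaintiff Attorney")
by_defendant = collate(filings, "Defendant")
print(by_attorney.most_common())
print(by_defendant.most_common())
```

In a real pipeline the HTML would come from an HTTP request rather than a string, and the counts would feed the published statistics. The point of the sketch is only that each step is small once the data is public.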