Some big news this week as President Obama further commits to opening up government data by signing an Executive Order. ProPublica also launched Nonprofit Explorer, based on recently released IRS tax return data, and discussions revolve around the weaknesses of big data and the need to improve data privacy.
Open Government Data Landmark Steps to Liberate Open Data
President Obama has signed an Executive Order to make government data more accessible to the public and to entrepreneurs, in the hope that this will spur innovation and economic growth. Under the terms of the Executive Order and a new Open Data Policy, all newly generated government data must be made available in open, machine-readable formats. The order follows a host of government initiatives that encourage open data and its reuse.
Nonprofit Data ProPublica Launches Online Tool to Search Nonprofit Tax Forms
ProPublica, an investigative-journalism nonprofit, has launched a free online service called Nonprofit Explorer that enables the public to search the federal tax returns of more than 615,000 nonprofits. The IRS recently released structured data from the tax returns of almost 616,000 tax-exempt organizations. Data on executive compensation, revenue, and expenses is available and downloadable from as far back as 2001.
Big Data Think Again: Big Data
This post in Foreign Policy calls for a look at the weaknesses of big data rather than just its strengths. Kate Crawford says that numbers can't speak for themselves, and that datasets, no matter how big they are, remain subject to bias and blind spots. According to her, there is a problematic belief that bigger data is always better data and that correlation is as good as causation. For example, though we glean insight from tweets, only 16% of online adults in the US use Twitter, and many accounts are "bots," fake accounts, or "cyborgs" (human-controlled accounts assisted by bots). Tweets also tend to come from younger and more urban sections of the population.
Has Big Data Made Anonymity Impossible?
In this MIT Technology Review post, Patrick Tucker discusses how anonymity is becoming mathematically impossible as the volume of digital data grows. Much of this data is invisible to people and may seem impersonal even though it is not. Modern data science has found that nearly any type of data can be used, much like a fingerprint, to identify the person who created it. The more personal data becomes available, the more informative it gets, and with enough data it can be possible to predict a person’s future.