The Congressional Research Service recently released a report (PDF, 688 KB, 17 pages, January 2016) describing the big data ecosystem for U.S. agriculture. The purpose of the report was to understand the federal government’s role in the emerging big data sources and technologies involved in U.S. agriculture. As the report’s author, Megan Stubbs, points out, there is not even a standard definition of big data.
“Big data may significantly affect many aspects of the agricultural industry although the full extent and nature of its eventual impacts remain uncertain. …It is still unclear how big data will progress within agriculture due to technical and policy challenges, such as privacy and security, for producers and policymakers.”
The report divides the agricultural big data ecosystem into two major sources.
- “Public-level big data represent records that are collected, maintained and analyzed through publicly funded sources, specifically by federal agencies (e.g., farm program participant records, Soil Survey and weather data).”
- “Private big data represent records generated at the production level and originate with the farmer or rancher (e.g., yield, soil analysis, irrigation levels, livestock movement and grazing rates).”
The rest of the report details the major actors that create and use public-level and private big data. This mapping of the major data producers and consumers gives a high-level overview of how big data flows through the agriculture ecosystem. The report then discusses issues common to all the actors, such as security, privacy and ownership of the data, as well as the benefits of big data. As the author observes, the mixture of public-level and private big data flowing through the ecosystem further complicates policy issues as the agriculture industry continues to develop.
“Big data is a complicated topic, not only from a technological and analytical standpoint, but also from a legal, ethical and regulatory standpoint. The number of key players continues to grow, as does the list of benefits and challenges. As Congress follows the issue a number of questions may arise, including a principal one—what is the federal role?”
The same could be said of any industry whose big data combines public-level and private data. Mapping the actors and information flows in these industries can help federal agencies decide how to release and distribute federal open data. Federal open data policies can then be tailored to increase the benefits and reduce the challenges inherent in each big data ecosystem.
—Update on a Previous Posting—
The December 16, 2015, Data Briefing dealt with the question of the U.S. government’s responsibility to educate users of federal open data. Recently, the Department of Commerce released the Commerce Data Usability Project (CDUP), a series of tutorials on how to use various Commerce datasets. You can read more about it on DigitalGov. The CDUP is a great example of how to provide tutorials for using federal open data sources.
Each week, The Data Briefing showcases the latest federal data news and trends.
Dr. William Brantley is the Training Administrator for the U.S. Patent and Trademark Office (USPTO)’s Global Intellectual Property Academy. You can find out more about his personal work in open data, analytics and related topics at BillBrantley.com. All opinions are his own and do not reflect the opinions of the USPTO or GSA.