Research article
First published online August 15, 2011

Science friction: Data, metadata, and collaboration

Abstract

When scientists from two or more disciplines work together on related problems, they often face what we call ‘science friction’. As science becomes more data-driven, collaborative, and interdisciplinary, demand increases for interoperability among data, tools, and services. Metadata – usually viewed simply as ‘data about data’, describing objects such as books, journal articles, or datasets – serve key roles in interoperability. Yet we find that metadata may be a source of friction between scientific collaborators, impeding data sharing. We propose an alternative view of metadata, focusing on its role in an ephemeral process of scientific communication, rather than as an enduring outcome or product. We report examples of highly useful, yet ad hoc, incomplete, loosely structured, and mutable, descriptions of data found in our ethnographic studies of several large projects in the environmental sciences. Based on this evidence, we argue that while metadata products can be powerful resources, usually they must be supplemented with metadata processes. Metadata-as-process suggests the very large role of the ad hoc, the incomplete, and the unfinished in everyday scientific work.

Information

Published In

Article first published online: August 15, 2011
Issue published: October 2011

Keywords

  1. collaboration
  2. communication
  3. data
  4. metadata

Rights and permissions

© SAGE Publications 2011.
PubMed: 22164720

Authors

Affiliations

Paul N. Edwards
School of Information, University of Michigan, Ann Arbor, MI, USA
Matthew S. Mayernik
Graduate School of Education and Information Studies, UCLA, CA, USA
Archer L. Batcheller
School of Information, University of Michigan, Ann Arbor, MI, USA
Geoffrey C. Bowker
School of Information Sciences, University of Pittsburgh, PA, USA
Christine L. Borgman
Graduate School of Education and Information Studies, UCLA, CA, USA

Notes

Paul N. Edwards, School of Information, University of Michigan, 3439 North Quad, 105 S. State St., Ann Arbor, MI 48109-1285, USA.
Paul N. Edwards is Professor of Information and History at the University of Michigan’s School of Information. His most recent book, A Vast Machine: Computer Models, Climate Data, and the Politics of Global Warming (MIT Press, 2010), was named a ‘2010 Book of the Year’ by The Economist. His research centers on the history, politics, and culture of information infrastructures.
Matthew S. Mayernik recently completed his PhD in Information Studies at UCLA. His dissertation, Metadata Realities for Cyberinfrastructure: Data Authors as Metadata Creators, examined everyday metadata practices of small-scale field-based research teams in seismology, ecology, aquatic biology, and environmental science. Mayernik is now a Research Data Services Manager at the University Corporation for Atmospheric Research (UCAR) in Boulder, CO, USA.
Archer L. Batcheller received his PhD from the University of Michigan School of Information in 2011, with a dissertation entitled Requirements Engineering in Building Climate Science Software. He is presently a fellow in the Future Technical Leaders program at Northrop Grumman.
Geoffrey C. Bowker is Professor and Senior Researcher in Cyberscholarship at the iSchool, University of Pittsburgh. His most recent book is Memory Practices in the Sciences (MIT Press, 2006). He studies emergent teams in cyberinfrastructure and emergent forms of knowledge expression in the sciences and humanities.
Christine L. Borgman is Professor and Presidential Chair in Information Studies at UCLA. Her most recent book, Scholarship in the Digital Age: Information, Infrastructure, and the Internet (MIT Press, 2007), won the Best Information Science Book of the Year award from the American Society for Information Science and Technology. Borgman’s research on data practices spans the domains of earth and space sciences, life sciences, computer science, engineering, and the humanities.

This article was published in Social Studies of Science.
