Digital Preservation 2012: Day 1
Digital Preservation 2012 was an enlightening and engaging experience for all WSU NDSA student members involved! The main conference opened with remarks by Martha Anderson of the Library of Congress. She stressed that we must recall the NDSA’s journey of the last two years, particularly the group’s growth through leadership and generosity. Additionally, the values that defined NDSA when it began two years ago, “stewardship, collaboration, inclusiveness, and exchange,” continue to guide us forward. She then turned the MC duties over to Bill LeFurgy, also of the Library of Congress, who in turn introduced Anil Dash.
Dash’s background is in the tech industry, and he is the cofounder and CEO of ThinkUp, a startup that allows people to capture the information they post on social networks. He is also the Founding Director of Expert Labs, whose mission, to broadly engage people and large social media, fits in nicely with that of the NDSA. His presentation discussed the challenges of preserving user-generated content on the web, particularly content contributed to social networks. These social networks, including Facebook, typically have anti-open web Terms of Service (TOS) agreements that users must accept before participating. Dash stated that these types of TOSs result in “ordinary conversations [that] are disposable” because the TOS gives the company the right to modify and delete user content at any time. Although prospective social media users may opt out, those who do miss out on the social media party held at that app, which can result in “severe social and career cost.”
While Dash conceded that the TOS problem only directly afflicts portions of the web, he argued that these types of TOSs are an integral part of the “war [that] is raging against the open web.” He supported this by stating that the “behavior of people on the web” includes spending the bulk of their time “within streams (big networks) rather than webpages.” He charged that companies requiring users to agree to TOSs are gaslighting the web, with the result that “ideas locked into apps will not survive an acquisition” and “administrators can remotely delete stuff on your phone.”
In addition to the obvious threat that TOSs pose to user-generated content, Dash cited several other challenges that threaten the open web and the preservation of its content: using wrong file types, lack of metadata, and content tied to devices. These are of particular concern to those engaged with digital preservation, as Dash stated that companies are “bending the law to make archiving illegal. Companies, especially those with hardware (e.g., Google and Apple) are defending and implementing DRM. You’re not allowed to archive your own stuff.” But Dash offered hope and a call to action. First he reinforced that “these people aren’t bad, they are ill-informed.” There is also the technological reality that the web works by making copies. Dash urged that this is a feature we all must leverage to ensure content is archived. He also gave examples of apps that “do the right thing,” like Timehop and Brewster. He stated that “PR trumps TOS” and that users, particularly those of us in the NDSA, are the advocates and must call on creators to do the right thing.
David Weinberger, a Senior Researcher at Harvard University’s Berkman Center and co-author of The Cluetrain Manifesto, discussed the relationship between information and knowledge and how both have evolved. He indicated that “information was invented to be managed in the computer age” and asserted that “systems are based on limitations”; for example, systems are often designed to accept only the minimum amount of necessary information as input.
Weinberger asserted that knowledge is “always about filtering,” and that knowledge has “been agreed upon, [and] drives out differences.” However, knowledge has evolved over time: previously “to know something is to know its single place in the universe,” but now the new medium of knowledge is “becoming networks and lives on networks, and taking on the qualities of networks.” Weinberger concluded that in the “age of the Internet, [there are] knowledge networks, [and] knowledge is unsettled,” resulting in “the net exploring a long hidden truth: we don’t agree.”
Indeed, according to Weinberger, it is this disagreement that “scales knowledge – it’s the connections between disagreements.” He stated, “data that was supposed to be atoms of information, they are instead links and connections to other things – a data commons – every release of data makes it more valuable.” He concluded by asserting that “it’s not that there’s too much information or information overload, it’s that the world is so big.”
A member of the audience asked both speakers the ultimate question surrounding digital archives: what is it that archives should do? Should archives attempt to save everything? Dash answered that there is no such thing as “everything” on the web because of how frequently things change. Instead, archives have to make choices. Weinberger took a slightly different position, advocating that archives make algorithms rather than decisions, so that materials are saved based on a series of rules rather than individual choice.
Carroll’s presentation, “Copyright and Digital Preservation: The Role of Open Licenses,” opened by comparing the problem of saving information with environmentalism, an approach aptly termed “information environmentalism.” The analogy is well crafted, as both involve “stewardship of valuable resources,” “long-term risk analysis,” “depletion/destruction,” and “access/use.” He asserted that people are “so focused on what’s new/next, and not thinking for the long term.” This led to a discussion on the role of copyright within the current information and digital environment:
- Rights are an intangible layer
- Workflows may be different
- Used to have to opt-in to copyright
- In a world of automatic copyright
- Concerns over a potential for Congress to give back copyright to US authors
- Attempt to chill HathiTrust
- Argument of fair use versus Section 108: only able to use one or the other?
Thus, the legal environment is far from static; the nature and authority of copyright are constantly evolving.
Carroll argued that “making copies for the purpose of fair use should be legally OK,” and went on to discuss digital rights management, presenting the NDSA (and broader community) with two asks:
Ask #1: “Can the preservation community organize itself to be the voice of tomorrow’s users on issues of copyright policy and copyright estate planning?”
Contingency plans need to be made for when “databases go dark,” when “copyright term extensions or ‘restorations’” occur, or when policy makers such as Google and Facebook (so termed because they have TOSs) exert content controls. An alternative approach to rights management is Creative Commons, which was “inspired by open source” and offers the license conditions of attribution, share alike, noncommercial, and no derivatives. While at present there is no way to know how many active CC licenses there are (Carroll explained that only linkbacks to the licenses can be enumerated), there are many out there, and in addition to being active in the US, there are 72 Creative Commons “Affiliate” teams located in other countries.
Ask #2: “Can the preservation community promote copyright?”
Carroll provided the following suggestions:
- Mark the digital public domain. This will allow people to see what is in the public domain and be aware of its existence.
- Encourage use of open licenses at the time of publication; if that is not possible, embed copyright licensing in long-term contingency planning (e.g., LOCKSS).
- Consider the use of a “springing” open license; this helps diminish the orphan works problem.
Bram’s presentation, “Assuring Future Access, from Infancy to Maturity,” discussed maintaining access to digital content. He began by presenting “who cares about what,” in terms of the following:
- Data curation and policies
- Ownership and licensing
- Selection (politicized)
- Access and usage (long term access = preservation)
He compared expectations with reality, such as the “expected outcome from funded projects and the real outcome,” and contrasted what we expect from tools and services with what we actually get: “orphans and abandoned data.” This led into Bram’s discussion of “long term access challenges”:
- New technologies and formats
- Unsupported legacy software
- Growth of data
- Data quality
Despite these challenges, Bram asserted that “a mature learning organization can assure long term access” through the right choices both in projects and within the organization, as listed below:
- Best Practices
- Having the right staff
- Retention of staff
- Development of staff
- Career paths
- Challenges and incentives
Following Bram’s presentation was a series of lightning talks discussing a wide range of subjects pertinent to digital preservation. These included:
- Christie Moffatt, National Library of Medicine – Developing a “Health and Medicine Blogs” collection at the U.S. National Library of Medicine
- Terry Plum, Simmons GSLIS – Teaching Digital Preservation in a Digital Curriculum Laboratory
- Daniel Krech, Library of Congress – Sets, Hypers, and Yarn
- Kelcy Shepherd, Amherst College – Our Collective Task: Digital Preservation at the Five Colleges
- Jefferson Bailey, Library of Congress – Personal Digital Archiving at the Library of Congress
- Carol Minton Morris, DuraSpace – Painting Crowdsourced Microfinance Platforms and Projects Into the Big Digital Preservation Picture: The National Digital Stewardship Alliance (NDSA) and Kickstarter
- Kristopher Nelson, Library of Congress – National Digital Stewardship Residency: Preserving Our Digital Future
- Moryma Aydelott, Library of Congress – Tackling Tangible Media
Reception and Poster/Demo Session
The day concluded with the reception and poster/demo session, including two posters from our WSU NDSA Student Group! The first, from group members Camille Chidsey, Laura Gentry, and Lisa Phillips, was entitled Preserving Your Past: Creating a Public Service Announcement Video on Digital Preservation. The poster detailed both the formation and activities of our group thus far and the creation of our proof-of-concept PSA on digital fragility.
The second poster, Archival Description Applied to Digital Formats and Digital Preservation: Maintaining Metadata throughout the Digital Preservation Process, was created by Alexandra Orchard. This poster was a shortened version of the paper she wrote earlier this year. The poster covered the following: Technical progress has resulted in greater dependency on metadata in archival description as materials become more readily available in electronic formats, yet standard methodologies to solve issues in digital preservation have not yet been established. Traditional theoretical tenets and practice have been affected, as both archival material and its description (i.e., metadata) must now be preserved in a way that ensures their authenticity and reliability, resulting in new potential theoretical models such as digital curation and digital historiography.