I always say I have the best job in the world. I am the archivist for The Coca-Cola Company, and I protect and preserve artifacts that document our rich heritage. We have an amazing collection that includes everything from original oil paintings by noted artists like Norman Rockwell to classic vending machines, reels of film with commercials, and even a 1939 delivery truck. We have over 2.5 miles of shelving in our temperature- and humidity-controlled collection storage area. My profession (I am a proud member of the Academy of Certified Archivists) has spent decades researching the best way to preserve each of these types of artifacts, and we follow well-documented standards in our archives to ensure the collection survives.
On the other side of the spectrum, I always say that my greatest fear and challenge as an archivist is capturing and preserving born-digital files, and websites in particular. As a case in point, the only representation we have of our first website, “The Galaxy of Coca-Cola,” is a screen grab of the homepage when it was launched in April 1995. We are not alone. The Economist recently published an essay on digital preservation in which they noted the difficulty and revealed that they are missing their first website as well.
You might wonder why this is important; the web changes every day, and in some ways it is like trying to capture a wave on the ocean. However, for a company and brand like Coca-Cola, our public expressions tell our story. Every print ad, television commercial or web experience we produce is an interaction with our fans. Over the past 15 years, our website has featured both our advertising and our corporate philosophy. As the steward of our heritage, I need to be able to capture these social expressions, just as I need to preserve a company painting or ad from 100 years ago.
The Internet Archive (or the Wayback Machine, as it has been called) began a concerted effort to capture “the web” in 1996, the year after we launched our first website. This nonprofit group began accepting contributions of archived websites to its collections from search engines and other private companies in 1996, though it did not make the content available until 2001. IA began crawling in earnest on behalf of itself and of cultural heritage and other memory institutions in 2003. While these early crawls serve a great purpose and will be critical to fulfilling IA’s mission to capture a “snapshot” of the web at any given time, we have discovered there are gaps in their collections that we are now working to fill. For some of our sites, I need a copy better than a snapshot.
After years of experimenting with various programs, in 2009 our archives began a concerted effort to crawl and capture a selection of our existing websites in a comprehensive fashion. We selected Hanzo Archives to capture over 20 of our brand or country websites from around the world. We chose Hanzo because they could offer full captures where the archived websites (saved on a quarterly basis) operate just as they did the day they were captured. All of the YouTube, Twitter, Facebook and other links will continue to work the way they did the day we did the crawl. With the increasing complexity of web content and the growing use of Flash and social media plugins, this concept of deeper, fuller crawls has worked well for us.
We are just about to enter the fourth year of our project, and the changes that we are documenting about our sites are astounding. Two of the crawls we have done have special meaning for me. We captured the site dedicated to our 125th anniversary in 2011 and our Olympic Games 2012 site this summer. We now have a record of what our company did during these two momentous occasions even though the sites have been taken down. While we are feeling better about our efforts, no single approach is perfect, and we keep adjusting our plans as we continue to work in this very difficult space.
I hope that you will enjoy the brief video where we document the changes to The Coca-Cola Company corporate site over the years. You can get a feel for how the sites and technologies have evolved. You may note some errors on the early pages because the data captured from the Internet Archive was not complete.