I always say I have the best job in the world. I am the archivist for The
On the other side of the spectrum, I always say that my greatest fear and challenge as an archivist is capturing and preserving born-digital files, and websites in particular. As a case in point, the only representation we have of our first website, “The Galaxy of
You might wonder why this is important; the web changes every day, and in some ways it is like trying to capture a wave on the ocean. However, for a company and brand like
The Internet Archive (known to many through its Wayback Machine) began a concerted effort to capture “the web” in 1996, the year after we launched our first website. This nonprofit group began accepting contributions of archived websites from search engines and other private companies in 1996, though it did not make the content available until 2001. IA began crawling in earnest on behalf of itself and of cultural heritage and other memory institutions in 2003. While these early crawls serve a great purpose and will be critical to fulfilling IA’s mission to capture a “snapshot” of the web at any given time, we have discovered gaps in its collections that we are now working to fill. For some of our sites, I need a copy better than a snapshot.
After years of experimenting with various programs, in 2009 our archives began a concerted effort to crawl and capture a selection of our existing websites in a comprehensive fashion. We selected Hanzo Archives to capture over 20 of our brand or country websites from around the world. We chose Hanzo because they could offer full captures in which the archived websites (saved on a quarterly basis) operate just as they did the day they were captured. All of the YouTube, Twitter, Facebook and other links will continue to work the way they did on the day of the crawl. With the increasing complexity of web content and the growing use of Flash and social media plugins, this concept of deeper, fuller crawls has worked well for us.
We are just about to enter the fourth year of our project, and the changes to our sites that we are documenting are astounding. Two of the crawls we have done have special meaning for me. We captured the site dedicated to our 125th anniversary in 2011 and our Olympic Games 2012 site this summer. We now have a record of what our company did during these two momentous occasions even though the sites have been taken down. While we are feeling better about our efforts, no single approach is perfect, and we keep adjusting our plans as we work in this very difficult space.
I hope that you will enjoy the brief video where we document the changes to The