At Amazon, I developed digital media products for millions of customers. On Search Inside the Book, I worked on the Media Technologies team to scale the digitized catalog to hundreds of thousands of books and internationalized the interface for overseas locales. I also implemented an avant-garde 'popover' that shows information about a book when a user mouses over its cover. Among other features that draw on this valuable text corpus, Amazon provides a list of Citations derived from a book's text. I designed an algorithm to build a citation index and mine book text for citations, implemented it efficiently, built and managed a cluster of dozens of servers to process hundreds of gigabytes of book text, adapted Amazon Distributed Hash Table technology to serve the citation corpus to hundreds of viewers per second, and implemented a simple web interface for the data. Both the popover and the citations functionality remain visible on one of Amazon's most visited pages, nearly 10 years after implementation.