This document outlines the design and implementation of a search engine prototype that processes millions of Wikipedia XML pages and returns the 10 most relevant documents for each user query. It covers the construction of an inverted index and the text preprocessing steps it depends on: tokenization, stop word removal, and stemming. It also describes both DOM and SAX parsers for reading XML, with an emphasis on handling large datasets efficiently.
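
To make the pipeline concrete, the following is a minimal sketch, not the prototype's actual code. It streams pages with Python's standard `xml.sax` module, preprocesses the text, and fills an inverted index that maps each term to per-document term counts. Several parts are assumptions made for illustration: the element names `page`, `title`, and `text`, the use of a running counter as the document ID, the tiny `STOP_WORDS` set, the `naive_stem` suffix stripper (a crude stand-in for a real stemmer such as Porter's), and the summed term-frequency score in `search`, since the ranking function is not specified here.

```python
import heapq
import io
import re
import xml.sax
from collections import defaultdict

# Assumption: a tiny illustrative stop word list, not the prototype's real one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "is", "to", "for"}


def naive_stem(token):
    """Crude suffix stripping; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize on alphanumeric runs, drop stop words, then stem."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


class PageHandler(xml.sax.ContentHandler):
    """SAX handler that indexes one page at a time without building a tree."""

    def __init__(self, index):
        super().__init__()
        self.index = index
        self.doc_id = 0        # assumption: running counter used as document ID
        self.in_text = False
        self.buffer = []

    def startElement(self, name, attrs):
        if name == "page":
            self.doc_id += 1
        elif name in ("title", "text"):
            self.in_text = True
            self.buffer = []

    def characters(self, content):
        # The parser may deliver one text node in several chunks.
        if self.in_text:
            self.buffer.append(content)

    def endElement(self, name):
        if name in ("title", "text"):
            for term in preprocess("".join(self.buffer)):
                self.index[term][self.doc_id] += 1
            self.in_text = False


def build_index(xml_stream):
    """Return {term: {doc_id: term_frequency}} for the given XML stream."""
    index = defaultdict(lambda: defaultdict(int))
    xml.sax.parse(xml_stream, PageHandler(index))
    return index


def search(index, query, k=10):
    """Rank documents by summed term frequency (an illustrative score only)."""
    scores = defaultdict(int)
    for term in preprocess(query):
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    return heapq.nlargest(k, scores, key=scores.get)


# Hypothetical two-page sample standing in for a real dump file.
SAMPLE = b"""<mediawiki>
  <page><title>Search engine</title>
    <revision><text>Indexing pages for searching</text></revision></page>
  <page><title>Parser</title>
    <revision><text>Parsing XML pages</text></revision></page>
</mediawiki>"""

index = build_index(io.BytesIO(SAMPLE))
results = search(index, "searching pages")
```

The SAX handler holds only the current element's text in memory, which is why a streaming parser suits dumps of this size. A DOM parser such as `xml.dom.minidom` would load the entire tree first, which is convenient for small files but costly for millions of pages.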