The document presents Enhanced Position-Aware Sampling (EPAS), a novel algorithm for improving file similarity detection in large datasets, particularly in server environments. EPAS is designed to address two limitations of existing algorithms, high time overhead and sensitivity to file modifications, by concurrently sampling data blocks so that accuracy is maintained. Experimental results show that EPAS significantly outperforms traditional methods such as Simhash in both efficiency and effectiveness when identifying Unicode data similarities.
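
To make the idea of sampling-based similarity detection concrete, here is a minimal sketch in Python. It is not the EPAS implementation: the summary does not give the sampling positions, block size, or similarity metric, so the choices below are assumptions made for illustration. These are the hypothetical helpers `sample_fingerprints` and `similarity`, blocks anchored at both the head and the tail of the file, SHA-256 as the block hash, and Jaccard overlap as the score. The sketch also samples sequentially rather than concurrently.

```python
import hashlib


def sample_fingerprints(data: bytes, block_size: int = 8, samples_per_end: int = 4) -> set:
    """Hash a few fixed-size blocks anchored at the head and at the tail.

    Assumption for illustration: anchoring half of the samples at the tail
    means an insertion in the middle of the file does not shift them.
    """
    fingerprints = set()
    for i in range(samples_per_end):
        # Head-anchored block: offsets counted from the start of the file.
        head_start = i * block_size
        head = data[head_start:head_start + block_size]

        # Tail-anchored block: offsets counted back from the end of the file.
        tail_end = max(len(data) - i * block_size, 0)
        tail = data[max(tail_end - block_size, 0):tail_end]

        for block in (head, tail):
            if block:  # skip empty slices on very short inputs
                fingerprints.add(hashlib.sha256(block).hexdigest())
    return fingerprints


def similarity(a: bytes, b: bytes) -> float:
    """Jaccard overlap of the sampled fingerprints (an assumed metric)."""
    fa, fb = sample_fingerprints(a), sample_fingerprints(b)
    if not fa and not fb:
        return 1.0
    return len(fa & fb) / len(fa | fb)


if __name__ == "__main__":
    original = bytes(range(256)) * 4
    # Insert a few bytes in the middle: head and tail samples are unaffected.
    middle_edit = original[:500] + b"INSERTED" + original[500:]
    # Prepend one byte: head samples shift, tail samples still match.
    head_edit = b"X" + original
    print(similarity(original, middle_edit))
    print(similarity(original, head_edit))
```

Only a fixed number of blocks is read regardless of file size, which is where the time savings over whole-file methods such as Simhash would come from. The trade-off in this sketch is that an edit falling outside the sampled regions goes undetected.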