1) The document describes a modified distributed arithmetic (DA) based discrete wavelet transform (DWT) processor architecture and its FPGA implementation for image compression.
2) The proposed architecture uses four lookup tables to store pre-computed partial products of the filter coefficients, achieving a latency of 44 clock cycles and a throughput of one output every 4 clock cycles.
3) A software reference model is developed in MATLAB to analyze the performance of various wavelets for image compression using the distributed arithmetic based DWT approach. The input image is resized, decomposed into sub-bands using the DWT, and reconstructed using the inverse DWT (IDWT).
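
The lookup-table idea in item 2 can be illustrated with a minimal Python sketch of distributed arithmetic. It is not the architecture described in the document. It assumes a single 4-tap filter with hypothetical integer coefficients (`COEFFS`), one lookup table rather than four, and 8-bit two's complement input samples. The names `build_lut` and `da_inner_product` are introduced here for illustration only. The table stores every possible sum of coefficients, so the inner product needs only table reads, shifts, and additions, with no multiplier.

```python
def build_lut(coeffs):
    """Precompute, for every address, the sum of the coefficients
    whose corresponding address bit is set."""
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]


def da_inner_product(samples, lut, bits=8):
    """Compute sum(coeffs[k] * samples[k]) bit-serially from the table.

    Each sample must fit in `bits`-bit two's complement.
    """
    mask = (1 << bits) - 1
    words = [s & mask for s in samples]  # two's complement encoding
    acc = 0
    for b in range(bits):
        # Build the table address from bit b of every sample.
        addr = 0
        for k, w in enumerate(words):
            addr |= ((w >> b) & 1) << k
        partial = lut[addr] << b  # weight the table entry by 2**b
        if b == bits - 1:
            acc -= partial  # the sign bit carries negative weight
        else:
            acc += partial
    return acc


# Hypothetical coefficients, chosen only to keep the arithmetic exact.
COEFFS = [1, 3, -3, -1]
LUT = build_lut(COEFFS)

samples = [12, -7, 100, -128]
result = da_inner_product(samples, LUT)
direct = sum(c * s for c, s in zip(COEFFS, samples))
print(result == direct)
```

In this sketch one table read is consumed per input bit, so an 8-bit sample takes eight accumulation steps. Splitting the address across several smaller tables, as the four-table design in item 2 suggests, trades table size against the number of reads per step. The exact partitioning used in the document is not reproduced here.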