This paper presents a novel FPGA-based Canny edge detection algorithm that uses an 8-bin non-uniform gradient magnitude histogram to compute hysteresis thresholds on a per-block basis, which reduces hardware area, memory usage, latency, and computational complexity. The hardware architecture was implemented on a Xilinx Virtex-5 FPGA and simulated with ModelSim, demonstrating real-time edge detection across various image resolutions. The proposed method yields edge detection results comparable to those of the original Canny algorithm at a significantly lower computational cost.
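
To make the block-based threshold calculation concrete, the following Python sketch illustrates one way an 8-bin non-uniform histogram could drive hysteresis threshold selection for a single block. It is an illustration under stated assumptions, not the hardware design itself: the power-of-two bin edges in `BIN_EDGES`, the `edge_fraction` and `low_ratio` parameters, and both function names are hypothetical choices made here for the example and are not taken from the paper.

```python
from bisect import bisect_right

# Assumed (hypothetical) non-uniform bin edges for 8-bit gradient magnitudes.
# Narrow bins at low magnitudes, wide bins at high magnitudes: 8 bins in total.
BIN_EDGES = [0, 2, 4, 8, 16, 32, 64, 128, 256]


def histogram8(magnitudes):
    """Count the gradient magnitudes of one block into 8 non-uniform bins."""
    counts = [0] * 8
    for m in magnitudes:
        idx = bisect_right(BIN_EDGES, m) - 1
        idx = min(max(idx, 0), 7)  # clamp out-of-range values into the end bins
        counts[idx] += 1
    return counts


def block_thresholds(magnitudes, edge_fraction=0.2, low_ratio=0.4):
    """Return (low, high) hysteresis thresholds for one block.

    Assumed rule: scan the bins from the strongest gradients downward and
    stop once at least `edge_fraction` of the block's pixels are covered;
    the lower edge of that bin becomes the high threshold, and the low
    threshold is a fixed fraction of it.
    """
    counts = histogram8(magnitudes)
    total = sum(counts)
    if total == 0:
        return 0.0, 0.0
    target = edge_fraction * total
    cumulative = 0
    high = float(BIN_EDGES[0])
    for i in range(7, -1, -1):
        cumulative += counts[i]
        if cumulative >= target:
            high = float(BIN_EDGES[i])
            break
    return low_ratio * high, high


if __name__ == "__main__":
    # A synthetic block: 70 weak-gradient pixels and 30 strong-gradient pixels.
    block = [1] * 70 + [100] * 30
    print(histogram8(block))
    print(block_thresholds(block))
```

Because only eight counters are kept per block, a scheme of this kind needs far less storage than a full 256-bin histogram, which is consistent with the memory and area savings claimed above; the exact bin layout and selection rule used in the hardware may differ from this sketch.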