This document presents a performance analysis and implementation of a non-binary quasi-cyclic low-density parity-check (NB-QC-LDPC) decoder architecture. It introduces an efficient check node processing scheme that reduces latency by more than 52% compared with previous methods. The techniques are applied to a (620, 310) NB-QC-LDPC decoder, which demonstrates improved coding gain and performance for wireless communications.