AFAIU we are missing tokenizer tests for the following supported models:

* Baichuan
* Bloom
* GptNeoX
* Persimmon
* Refact
* Starcoder

It would be great if anyone could help out.