Communication between Arduino and ESP8266-01

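I don't think it's a perfect method, but there is no reason why almost the same code should not work on the ESP, unless there is a hardware issue.
The main thing is that transmission at 9600 bps takes about 1 ms per byte (10 bits per byte with the usual 8N1 framing), while reading received bytes out of the serial buffer is much faster. That means the receiving code can empty the buffer before the rest of the message has arrived.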
So far you have succeeded because the whole string has been transmitted (and received) before parsing starts.
There are loads of examples of Serial transmission and reception to be found, many of which are non-blocking in nature; this is what you want.
Have a look at Serial Input Basics.
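
To give a rough idea of what non-blocking reception looks like, here is a minimal sketch. It is not taken from any particular tutorial, and LineReader, feed(), line() and length() are names I made up for illustration. It assumes the sender ends each message with a newline ('\n'). It is written as plain C++ so you can try it off the board as well; nothing in it waits for data, it just stores whatever byte it is given and reports when a whole line is in.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the original post): collects bytes one at a
// time and reports when a complete line, terminated by '\n', has arrived.
// N is the buffer size including the terminating '\0'.
template <std::size_t N>
class LineReader {
    static_assert(N > 1, "buffer needs room for one character plus '\\0'");

public:
    // Feed one received byte. Returns true once a complete line is ready.
    // The line stays readable through line() until the next call to feed().
    bool feed(char c) {
        if (ready_) {              // the previous line has been handed over,
            len_ = 0;              // so start collecting a new one
            ready_ = false;
        }
        if (c == '\r') {
            return false;          // ignore carriage returns
        }
        if (c == '\n') {
            buf_[len_] = '\0';     // terminate the string
            ready_ = true;
            return true;
        }
        if (len_ < N - 1) {
            buf_[len_++] = c;      // store the byte; extra bytes are dropped
        }
        assert(len_ < N);          // there is always room for the '\0'
        return false;
    }

    const char* line() const { return buf_; }
    std::size_t length() const { return len_; }

private:
    char buf_[N] = {};
    std::size_t len_ = 0;
    bool ready_ = false;
};
```

On the Arduino you would call it from loop(), something like: while (Serial.available() > 0) { if (reader.feed(Serial.read())) { /* parse reader.line() here */ } }. The point is that parsing only starts once the terminator has been seen, so it no longer matters how many bytes happen to be in the buffer on any given pass through loop().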