This is a mix of different types of voices. 1-11 seconds: sweet or spoiled child voice. 11-17 seconds: various squeaky voices, like a cute unknown creature or a Pokémon. 17-20 seconds: monster voice. 20-25 seconds: dog impersonations. 32-40 seconds: sad story narration.