Reduce allocation cost for clang-cl

Compiling Node.cpp in blink takes 40 seconds on native Windows (Z620),
but it took more than 4 minutes with wine + clang-cl on Linux (Z620).
With this change, compiling Node.cpp takes just 40 seconds with
wine + clang-cl, which is as fast as on native Windows.

This change makes two improvements.

(1) Fix a wine performance bug.
Because of a wrong memory size calculation, the existing wine
implementation sometimes starts the freelist search one size class too
small. No arena in that class can satisfy the requested size, so every
attempt fails until the search moves on to the next size class.
Searching this unnecessary freelist made compilation really slow.

This improves clang-cl compile speed by 4x (4m -> 1m).
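
The effect of starting one class too small can be sketched as below.
This is an illustration only, not the wine code: the class bounds, the
header size, and all names are hypothetical, and the miscalculation is
assumed here to be a forgotten per-block header, which this message
does not actually state.

```c
#include <stddef.h>

/* Hypothetical size classes: class i holds free blocks of at most
 * kClassMax[i] bytes (the last class is a catch-all). */
static const size_t kClassMax[] = {32, 64, 128, 256, 512};
enum { kNumClasses = sizeof(kClassMax) / sizeof(kClassMax[0]) };

/* Assumed per-block overhead that a free block must also cover. */
enum { kHeaderSize = 16 };

/* Smallest class whose upper bound can hold `size` bytes. */
static int class_for(size_t size)
{
    int i;
    for (i = 0; i < kNumClasses - 1; i++)
        if (size <= kClassMax[i]) break;
    return i;
}

/* Buggy start: the size used for the lookup omits the header, so the
 * search may begin in a class whose blocks are all too small. */
int first_class_buggy(size_t request)
{
    return class_for(request);
}

/* Fixed start: look up the class using the full size that is needed. */
int first_class_fixed(size_t request)
{
    return class_for(request + kHeaderSize);
}

/* Number of freelists that are walked without any chance of success. */
int wasted_classes(size_t request)
{
    return first_class_fixed(request) - first_class_buggy(request);
}
```

With these assumed numbers, a 60-byte request needs 76 bytes in total,
but the buggy lookup starts in the class capped at 64 bytes, so that
whole freelist is walked in vain before the next class is tried.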

(2) Use fine-grained freelist classes.
clang-cl allocates blocks of 200 to 500 bytes very frequently, and
often allocates blocks of around 2000 bytes. So, for sizes below 512
bytes, we introduce freelists with 16-byte granularity. For sizes below
4K bytes, we use freelists with 256-byte granularity.
Also remove the loop from the get_freelist_index() function to make it
faster, since it is called a lot.

This improves clang-cl compile speed further (1m -> 40s).
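
A loop-free index computation with these granularities might look like
the sketch below. It is not the actual get_freelist_index() from this
change: the function name, the constants, and the single catch-all
class for sizes of 4K and above are assumptions for illustration.

```c
#include <stddef.h>

enum {
    kSmallStep   = 16,    /* 16-byte classes below 512 bytes */
    kSmallLimit  = 512,
    kMediumStep  = 256,   /* 256-byte classes below 4K bytes */
    kMediumLimit = 4096,
    kNumSmall    = kSmallLimit / kSmallStep,                    /* 32 */
    kNumMedium   = (kMediumLimit - kSmallLimit) / kMediumStep,  /* 14 */
    kLargeIndex  = kNumSmall + kNumMedium  /* assumed catch-all class */
};

/* Map a block size to a freelist index with two comparisons and one
 * division, instead of looping over a table of class boundaries. */
unsigned freelist_index_sketch(size_t size)
{
    if (size < kSmallLimit)
        return (unsigned)(size / kSmallStep);
    if (size < kMediumLimit)
        return kNumSmall +
               (unsigned)((size - kSmallLimit) / kMediumStep);
    return kLargeIndex;
}
```

Under these assumptions the 200 to 500 byte range spreads across
indices 12 to 31 rather than sharing a few coarse classes, and a
2000-byte block lands in index 37.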

BUG=b/34110848

Change-Id: I467140dc71185d7179ffb09426b39056644b3309
1 file changed