Try:
Code:
-march=r12000 -O3 -fomit-frame-pointer -fno-math-errno -fno-rounding-math -fno-signaling-nans -funsafe-math-optimizations -fgcse-sm -fgcse-las -fipa-pta -ftree-loop-linear -ftree-loop-im -fivopts -fno-keep-static-consts
These options make some numerical code run 40% faster than MIPSpro's "-Ofast=ip35 -TARG:processor=r12000 -OPT:IEEE_arithmetic=3 -OPT:alias=typed". YMMV.
Also note that you shouldn't use "-Ofast" on gcc, since it enables "-ffast-math", which in turn enables "-ffinite-math-only". That option lets the compiler assume Inf and NaN never occur, so it will break any code that relies on them (the JavaScript interpreter probably does).
You can also try adding:
Code:
-fgraphite-identity -floop-block
to the above options. Then you can try adjusting "--param l1-cache-size", "--param l1-cache-line-size", and "--param l2-cache-size" to match your CPU for better performance. You can also try other Graphite-based optimizations, but they're still a bit experimental and can break the compiler.
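As a rough sketch, this is how the cache parameters are passed. The numbers are only my assumption for an R12000 with a 32 KB L1 data cache, 32-byte lines, and a 2 MB L2; check "hinv" for what your machine really has. gcc takes the two cache sizes in kilobytes and the line size in bytes.
Code:
```shell
# Assumed values for illustration only; check `hinv` on your own machine.
L1_KB=32      # L1 data cache size, in kilobytes
L1_LINE=32    # L1 data cache line size, in bytes
L2_KB=2048    # L2 cache size, in kilobytes

GRAPHITE_FLAGS="-fgraphite-identity -floop-block"
CACHE_FLAGS="--param l1-cache-size=$L1_KB --param l1-cache-line-size=$L1_LINE --param l2-cache-size=$L2_KB"

# Append both groups to whatever base flags you already use.
CFLAGS="$CFLAGS $GRAPHITE_FLAGS $CACHE_FLAGS"
echo "$CFLAGS"
```
Time your code before and after each change; these values are a starting point, not a tuned result.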
If the code doesn't use C++ exceptions, add:
Code:
-fno-exceptions -freorder-blocks-and-partition
(if it does, the compiler will tell you: gcc reports an error when code uses exception handling with exceptions disabled).
If the code doesn't use C++ RTTI, add:
Code:
-fno-rtti
(as above, if the code uses it, the compiler will tell you instead of silently building broken code).
Also, link everything statically (this also improves program startup time).
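Putting the optional C++ flags and static linking together could look roughly like this. USES_EXCEPTIONS, USES_RTTI, "app" and "main.cpp" are hypothetical names I made up for the example, the base flags are shortened, and I'm assuming "-static" is how you ask gcc for a static link on your system (it only works if static versions of the libraries are installed).
Code:
```shell
# Hypothetical switches: set to "yes" if the code base uses the feature.
USES_EXCEPTIONS=no
USES_RTTI=no

# Shortened base flags; use the full list from the top of this post.
CXXFLAGS="-march=r12000 -O3 -fomit-frame-pointer"
# Assumption: -static is accepted and static libraries are available.
LDFLAGS="-static"

if [ "$USES_EXCEPTIONS" = no ]; then
    CXXFLAGS="$CXXFLAGS -fno-exceptions -freorder-blocks-and-partition"
fi
if [ "$USES_RTTI" = no ]; then
    CXXFLAGS="$CXXFLAGS -fno-rtti"
fi

# Print the command instead of running it, so the sketch has no side effects.
echo g++ $CXXFLAGS $LDFLAGS -o app main.cpp
```
Drop the "echo" once the printed command looks right for your build.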