How to make TypeScript Blazingly Fast (JavaScript/NodeJS) - from ThePrimeagen on Qualified.One

Gatsby: blazingly fast. Seven React libraries: blazingly fast. Blazing fast JavaScript. The promise of every JavaScript library. But you're probably asking yourself, how do you make fast that which is slow? Astute question and observation. So you can't, really. I mean, technically JavaScript is slower than a lot of other languages, but you can definitely make it fast in comparison to itself. Blazingly fast. But how do you make it faster? Well, some of the techniques you can use to make your JavaScript applications faster are actually a little bit surprising. So I'm going to walk you through how to actually do it, how to actually make JavaScript basically fast.

So what did we build? And by "we" I mean me building the actual application, and of course the degenerates on Twitch explaining to me what I've done wrong and what I should be doing. Wow, so Zig has zero macros and zero metaprogramming, yet it's still powerful enough to express any complex program in a simple and concise manner? What are you, a walking pamphlet? Get the hell out of here.

So what I did build was actually a pretty simple game server that runs at 60 ticks per second, and we measured how successful we were at maintaining 60 ticks per second. The game starts off with two players, each given a fire rate. It's obvious that one of the fire rates is slower, therefore one of the players is going to lose every single time. After receiving a ready-up command and a play-game command, the players start firing at each other every 200 milliseconds, which means the player with the slower fire rate is at a disadvantage and will eventually lose the game. The game looks like this: we start in the ready state. Once both players are ready and have started firing, we enter a loop. We check for any fire commands and create new bullets if needed, we check for collisions, and we move the bullets according to how much time has passed since the last loop. Then we check for an end state to the game. So this should give us a pretty robust place to test out some performance improvements.

So let's get started, let's try to make this application a bit more blazingly fast. No, Karen, blazingly fast is not a euphemism. I mean, I guess it kind of... I get it, it kind of is. Yeah, okay, reasonable. I will stop saying that. You win, Karen, I'll update my vocabulary. Blazingly fast.

The must-do first step to improving any performance is to get a baseline measurement: what are you trying to improve? If you recall from earlier, this was our game loop. Now, every single time it runs we may get some good loops, but sometimes we get some bad loops, meaning something happened that caused it to run a little longer than expected. And of course we can take all these times and bucket them together: there are seven good loops, two that, you know, kind of exceed the budget, and one that goes even further over. With the actual data I was capturing I had many more buckets to fill in, and we would have somewhere between 200 and 800 concurrent games being played at once on a linode.com/prime single-CPU instance, so that it was a fairly repeatable experiment. A little side note: you shouldn't run your experiments on your own computer, just because your computer has a lot of things happening, so I always use an instance that is at least quieter and more predictable than my computer.

With the base implementation, no performance improvements at all, about 96.6% of the ticks were actually considered good ticks, within 16 to 17 milliseconds. The rest were comprised of mostly 19 to 20 millisecond ticks, and after that a very small percentage for anything above that.
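As a rough illustration of what measuring and bucketing tick times in a 60-ticks-per-second loop can look like, here is a minimal sketch. The names (`runGameLoop`, `updateGame`) and the exact bucket boundaries are placeholders of mine, not the code from the actual repo linked further down.

```typescript
// Minimal sketch: a 60-ticks-per-second loop that records how long each tick
// actually took and buckets the durations. `updateGame` is a stand-in for the
// real work (fire commands, collisions, bullet movement, end state).
const TICK_MS = 1000 / 60; // ~16.7ms budget per tick

function bucketFor(tickMs: number): string {
  if (tickMs <= 17) return "16-17ms (good)";
  if (tickMs <= 20) return "18-20ms";
  return ">20ms";
}

async function runGameLoop(
  updateGame: (deltaMs: number) => boolean, // returns true when the game is over
): Promise<Map<string, number>> {
  const buckets = new Map<string, number>();
  let last: number | undefined;

  while (true) {
    const now = Date.now();
    const delta = last === undefined ? 0 : now - last; // real duration of the previous tick
    last = now;

    if (delta > 0) {
      buckets.set(bucketFor(delta), (buckets.get(bucketFor(delta)) ?? 0) + 1);
    }

    if (updateGame(delta)) break;

    // Sleep off whatever is left of this tick's budget.
    const spent = Date.now() - now;
    await new Promise((resolve) => setTimeout(resolve, Math.max(0, TICK_MS - spent)));
  }

  return buckets;
}
```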
Once we increased to 400 concurrent games, or about 800 active players, the percentage dropped quite a bit, even more so once we got up to 1200, and at 1600 less than half of the ticks were even considered good. So the TypeScript server fell apart pretty hard as it got under load. Now we have a goal: simply make this blue bar bigger. Again, Karen, not a euphemism... or maybe we're just trying to make a big girthy blue bar. Okay, totally normal, totally normal.

One of the easiest ways to understand what's happening in your server is to make a flame graph. With Node it's pretty simple: you pass a single flag, --perf-basic-prof, run perf, throw the output through the little Perl script, generate the flame graph, and boom, you've got yourself a flame graph. Now how do you read this thing? Effectively, perf works like taking a stack trace hundreds or thousands of times a second. It takes all those stack traces and sorts them, so the x-axis isn't meaningful in a time sense, meaning you couldn't say, okay, this one call over here took 45 milliseconds. Instead it shows proportions. So when I look at this, I can see that right here my program spends 45% of its time within this function. It's a really easy way to see what is taking a long time to run.

Now, when I look through a lot of these peaks, what I see is a lot of epoll, so I'm probably not going to improve that part of my program. I see TCP, again probably not something I'm going to improve. Node internals for firing timers, Node internals for reading streams: all of these I'm probably unlikely to improve. My application really lives within this stack right here. So let's look at this top one specifically. It is called collisions, and it's taking 20 percent of the time. When I do a little control-find for collisions, in case it appears elsewhere in the stack (I've zoomed out just so you can see this), down below it says I am in this function 21.5 percent of the time. I know what function this is, and I know how I can probably make it faster, so this looks like a great place to try to improve. The reason is that I just did a simple n-squared algorithm: take every single object and compare it to all the rest. What I could do instead is split player one's bullets from player two's bullets and only test them against each other, because player one's bullets will never collide with each other, and player two's bullets will never collide with each other; collisions only happen in one direction. So, easy win.

All right, so I re-ran the program, and what I see is that my collision check went from 21 and a half percent of the time to only eight percent of the time. This means we should see an effect; we should actually see our program run faster, at least I would assume so. At this point I've made a pretty decent-sized improvement. You'll notice that this side of the graph actually got larger, meaning we have more time for the operations whose running time I can't control, because I reduced the time I can control. So I re-ran the exact same experiment, and what I see is a dramatic improvement in the amount of blue ticks at both 800 connections and 1200 connections. Comparing them, the well-behaved ticks (16 to 17 milliseconds) went from about 62.4 percent of the time in the original server to 72 percent of the time. So this is definitely better.
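To make the collision change concrete, here is a minimal sketch of the idea described above: only compare bullets across players instead of comparing every bullet against every other bullet. The `Bullet` shape and the `collides` helper are assumptions for illustration, not the repo's actual types.

```typescript
// Instead of an O(n^2) pass over all bullets, only cross-player pairs are
// tested, since bullets from the same player can never collide.
interface Bullet {
  x: number;
  width: number;
}

function collides(a: Bullet, b: Bullet): boolean {
  // 1D overlap test: bullets travel toward each other along the x axis.
  return a.x < b.x + b.width && b.x < a.x + a.width;
}

function findCollisions(p1Bullets: Bullet[], p2Bullets: Bullet[]): [Bullet, Bullet][] {
  const hits: [Bullet, Bullet][] = [];
  for (const b1 of p1Bullets) {
    for (const b2 of p2Bullets) {
      if (collides(b1, b2)) hits.push([b1, b2]);
    }
  }
  return hits;
}
```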
All right, so if you don't know how to run perf, I will include the commands in the readme on the repo: https://github.com/ThePrimeagen/tyrone-biggums. Do that free five o'clock crack giveaway again? Yes, yes, Twitch did choose the name, okay, stop judging me. All the links will be down below.

Were you surprised at how much of an improvement we got? That was a decent 10% improvement from fixing something that was pretty trivial to fix. Often what you'll find when you look at a flame graph is that you've made some pretty simple mistakes. But let's try to improve our server on a completely different dimension, something that might surprise you. You can run Node with the --inspect flag and you'll be able to see a Memory tab in the Chrome debugger. By going to allocation sampling, you can measure, just like perf traces, memory being created instead of CPU being used. The graphs read identically: this takes up a lot of memory, this takes up some memory, this takes up a lot of memory. The total amounts of memory probably don't really matter, because they're not quite correct, it's a sampled amount; again, you just look at proportions and ask, can you improve it proportionally?

You'll notice right here, and this looks kind of crazy, that about 33 percent of our memory is being generated from a place called processMessage. What processMessage does is take in a fire command from the websocket, create a bullet, and add it to the world. I was a little bit surprised seeing that, because when we look at our flame graph, we're only in this function 2.7% of the time. So it's not a heavy CPU function, but it is a heavy memory-generating function. Along with that, we generate a lot of memory just listening to sockets. So if we could improve that, along with how we keep stats for the game and the players we create, we could see much better performance in our server.

So we're going to improve this by creating memory pools. Effectively, a memory pool allows you to cache an object and reuse it over and over again. Since I'm going to create literally thousands of bullets a second, I should probably just have a pool: these short-lived objects can be created once and reused over and over, if I simply reset their position and their direction. To create a memory pool I created a very simple ring buffer. If you're unfamiliar with ring buffers: a ring buffer gives you a fixed-size array with two pointers, one where you insert and one where you remove, and as you insert and remove they go around. Eventually your insertion could catch up to your removal; at that point you need to create a bigger area to store your items. Or, in the other direction, your removal catches up to your insertion; then you just need to create a new object to be held by the memory pool.

Now the dangerous part about doing this is that you have to manage your own memory, meaning you can't just let the bullet be garbage collected; you need to manage its lifetime (RUST, BABY). When I create a bullet I actually have to take it from the pool, update its position, and set its direction, and every single time I remove a bullet I have to call the cleanUp method and make sure the bullet re-adds itself back to its own pool. Which also means that when we tear down our world I need to remove all the bullets, going one by one through the list, along with removing the lists themselves, releasing the players, doing all of that. So this can actually become quite cumbersome, and you can easily mess it up, but the effects are incredible.
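Here is a minimal sketch of what a bullet pool backed by a ring buffer can look like, following the behavior described above: grow the buffer when insertion catches up to removal, and allocate a fresh object when removal catches up to insertion. The field and method names, including `cleanUp`, are placeholders for illustration rather than the repo's actual implementation.

```typescript
// Sketch of a bullet pool backed by a simple ring buffer. The pool, not the
// garbage collector, owns each bullet's lifetime: bullets are reset and
// reused instead of being reallocated thousands of times per second.
class Bullet {
  x = 0;
  y = 0;
  dir = 1;

  constructor(private pool: BulletPool) {}

  // Re-initialize a recycled bullet instead of allocating a new one.
  reset(x: number, y: number, dir: number): void {
    this.x = x;
    this.y = y;
    this.dir = dir;
  }

  // Must be called explicitly when the bullet leaves the world.
  cleanUp(): void {
    this.pool.release(this);
  }
}

class BulletPool {
  private buffer: (Bullet | undefined)[];
  private head = 0; // next slot to take from
  private tail = 0; // next slot to return to
  private count: number;

  constructor(capacity: number) {
    this.buffer = new Array(capacity);
    for (let i = 0; i < capacity; i++) this.buffer[i] = new Bullet(this);
    this.count = capacity;
  }

  acquire(x: number, y: number, dir: number): Bullet {
    let bullet: Bullet;
    if (this.count > 0) {
      bullet = this.buffer[this.head]!;
      this.buffer[this.head] = undefined;
      this.head = (this.head + 1) % this.buffer.length;
      this.count--;
    } else {
      // Removal caught up with insertion: the pool is empty, so make a new object.
      bullet = new Bullet(this);
    }
    bullet.reset(x, y, dir);
    return bullet;
  }

  release(bullet: Bullet): void {
    if (this.count === this.buffer.length) {
      // Insertion caught up with removal: the buffer is full, so grow it,
      // re-packing the live entries starting from `head`.
      const grown: (Bullet | undefined)[] = new Array(this.buffer.length * 2);
      for (let i = 0; i < this.count; i++) {
        grown[i] = this.buffer[(this.head + i) % this.buffer.length];
      }
      this.buffer = grown;
      this.head = 0;
      this.tail = this.count;
    }
    this.buffer[this.tail] = bullet;
    this.tail = (this.tail + 1) % this.buffer.length;
    this.count++;
  }
}
```

In a setup like this, a handler such as processMessage would call pool.acquire(...) instead of constructing a bullet, and bullet.cleanUp() would return it to the pool when it collides, leaves the world, or the game is torn down.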
So here's the previous one. Notice, if you look at this graph, that this section over here pretty much represents ws, the blazingly fast websocket library, and you can see some more of it right in here as well. So about 40 percent of it is comprised of just ws memory and such. When we look at the new version, you'll notice that processMessage becomes nothing. I do have to JSON-parse, so of course that's going to have some effect, but on top of that, the ws memory being used is now something like 66 percent of the program. We've really slimmed our program down to be as optimal as possible.

Now, before I show you the results, what do you think is going to happen? Will we actually be faster, or is all this memory management not going to improve anything? Because you could defeat the compiler, right? The JIT may not be able to run; I have no idea. The JavaScript engine is a very complicated piece of machinery, and what you think may make it faster may make it slower. So, just to remind you, in our base implementation we had good frames about 64.2 percent of the time at 1200 connections. Now when we look at our memory-pool version, we actually see good ticks 74.7 percent of the time. This is actually a bigger improvement than our hotspot fix. For me this was kind of surprising; I did not think it would be a better improvement. It didn't show up in the flame graphs the way I thought it would, and I thought the fix was kind of obvious, so it was a bit surprising how good it was. Something else that was kind of surprising: you'll notice that the memory used by our base implementation was getting up over 200 megabytes at 1200 players, versus right around 110 megabytes for our memory optimization. Look at the memory: much smoother, barely grows. The average is literally twice as much for our base implementation versus our memory-optimized one.

All right, so for the final experiment I played a hundred thousand games with the base implementation versus the final implementation, with both the memory and the hotspot improvements, to compare the differences. Of course, when we get into this final one right here, we're looking at 46.1 percent of the frames being within our acceptable range. That means over half of the frames were 18 milliseconds or higher, which is actually pretty surprising; our server was really slowing down. All right, so let's look at the exact same graph at 100,000 games played: the ticks were good 54.1 percent of the time, which is actually an over 20 percent improvement compared to the base implementation. At 1200 connected players it's 75 percent versus 64 percent, another huge improvement.

Okay, yeah, come on in, Twitch. Okay, yeah, oh, okay, yeah, YouTube, you can come in too. I get it, you guys want Rustlang and Golang. Well, here's the deal: like I said last time, give this video a like, do a little comment, let me know that you like this content, because once again, if you guys don't interact with me, how am I supposed to measure the improvement here? Okay, come on. As you can see, I like graphs, so if you're not providing me the information, I can't graph it. And of course, YouTube, if you want to see this created live, you've got to head on over to Twitch. I mean, look at Twitch, what a nice character, huh? You like him? Well, guess what, you can hang out with Twitch all day long.