This week I came across an opportunity to make the Llama model run faster using AVX SIMD programming. Rethinking even simple operations like matrix multiplication can bring about a lot of improvement. I have written a detailed journal of how I modified the matmul function to achieve that. #HighPerformanceComputing #HPC #AVX #SIMDProgramming #LLAMA2 #Optimization #LLMModel #CProgramming #DeepLearning #MachineLearning #ParallelComputing #Vectorization #PerformanceOptimization #ComputationalScience #ScientificComputing
Nice find. Did you submit a PR?
Inspiring work!!
Good work Nevin!
Inspiring! Nice optimisation you have done there. I hope you submitted a pull request for this! 😊