Hi there, I'm Dara 👩👋
I am a machine learning engineer and an aspiring AGI researcher.
My work primarily revolves around multimodal foundation models, large-scale pre-training, and inference optimization. I am currently deep-diving into HPC for deep learning, writing custom GPU kernels in Triton that leverage tiling, shared memory, and parallel execution to overcome the memory wall. My focus is on accelerating DL training and inference through IO-aware kernel design.
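To give a flavor of what IO-aware means here, below is a minimal sketch in the spirit of the classic Triton fused-softmax tutorial (names and shapes are illustrative, and it assumes each row fits in one tile): each row is loaded from HBM once, the whole softmax is computed on-chip, and the result is written back once, instead of round-tripping intermediates through global memory.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def softmax_kernel(out_ptr, in_ptr, n_cols, in_stride, out_stride,
                   BLOCK_SIZE: tl.constexpr):
    # One program instance per row: rows run in parallel across the GPU.
    row = tl.program_id(0)
    offsets = tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_cols
    # Single read from HBM into an on-chip tile.
    x = tl.load(in_ptr + row * in_stride + offsets, mask=mask,
                other=-float("inf"))
    # Numerically stable softmax, computed entirely on-chip.
    x = x - tl.max(x, axis=0)
    num = tl.exp(x)
    y = num / tl.sum(num, axis=0)
    # Single write back to HBM.
    tl.store(out_ptr + row * out_stride + offsets, y, mask=mask)

def softmax(x: torch.Tensor) -> torch.Tensor:
    n_rows, n_cols = x.shape
    out = torch.empty_like(x)
    # Assumes a row fits in one tile; next power of 2 keeps tl.arange happy.
    BLOCK_SIZE = triton.next_power_of_2(n_cols)
    softmax_kernel[(n_rows,)](out, x, n_cols, x.stride(0), out.stride(0),
                              BLOCK_SIZE=BLOCK_SIZE)
    return out
```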
I care about AI safety and interpretability, so I occasionally do some mechanistic interpretability probing and write about my findings here.
- Infuse audio: A framework for aligning audio representations with the embedding space of LLMs (multimodality)
- Ablate compliance: Finding jailbreak directions within the activation space of an LLM (see the sketch after this list)
- Flash Attention and Diffusion Kernels in Triton: Highly optimized Triton kernels for flash attention, linear attention, and diffusion models
- Upcycle MoE: A framework for upcycling any dense model into a sparse Mixture-of-Experts architecture
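The core trick behind the jailbreak-direction work is directional ablation: estimate a "refusal" direction in the residual stream (typically as a difference of mean activations on harmful vs. harmless prompts) and project it out of every layer's hidden states. Here is a minimal PyTorch sketch of the idea, not the repo's actual code; the `model.model.layers` layout and the hook names are assumptions based on a Llama-style HuggingFace model.

```python
import torch

def ablate_direction(hidden: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Project the direction out of every hidden state: h' = h - (h . d) d,
    so the model can no longer write along it in the residual stream."""
    d = direction / direction.norm()
    return hidden - (hidden @ d).unsqueeze(-1) * d

def add_ablation_hooks(model, direction: torch.Tensor):
    """Hypothetical helper: hook every decoder layer and strip the direction."""
    def hook(module, inputs, output):
        h = output[0] if isinstance(output, tuple) else output
        h = ablate_direction(h, direction.to(device=h.device, dtype=h.dtype))
        # Returning a value from a forward hook replaces the layer's output.
        return (h, *output[1:]) if isinstance(output, tuple) else h
    # Assumes a Llama-style layout; adjust for other architectures.
    return [layer.register_forward_hook(hook) for layer in model.model.layers]
```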
- You can reach me via email
- I regularly write about deep learning and GPU programming on my Blog
- Connect with me on LinkedIn and X
Learn more about my experience: Link