
Hi there, I'm Dara 👩👋

I am a machine learning engineer and an aspiring AGI researcher.

My work primarily revolves around multimodal foundation models, large-scale pre-training, and inference optimization. I am currently deep-diving into HPC for deep learning, writing custom GPU kernels in Triton that leverage tiling, shared memory, and parallel execution to overcome the memory wall. My focus is on accelerating DL training and inference through IO-aware kernel design.
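As a small taste of that style, here is a minimal, illustrative Triton kernel (the names and the toy op are mine for this sketch, not from any of my repos): each program instance owns one tile, loads it from HBM once, computes on-chip, and writes back once.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def scale_add_kernel(x_ptr, y_ptr, out_ptr, n_elements, scale,
                     BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)               # which tile this program owns
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements               # guard the ragged last tile
    x = tl.load(x_ptr + offsets, mask=mask)   # one coalesced read per tile
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x * scale + y, mask=mask)  # single write-back

def scale_add(x: torch.Tensor, y: torch.Tensor, scale: float = 2.0):
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)            # one program per 1024-element tile
    scale_add_kernel[grid](x, y, out, n, scale, BLOCK_SIZE=1024)
    return out
```

The same tiling-and-masking pattern, scaled up to 2D blocks and on-chip accumulators, is what makes attention and matmul kernels IO-aware.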

I care about AI safety and interpretability, so I occasionally do mechanistic interpretability probing and write about my findings here.
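A typical starting point for that probing is a simple linear probe on cached activations, to test whether a concept is linearly decodable at a given layer. This is a toy sketch with illustrative names, not code from my write-ups:

```python
import torch
import torch.nn as nn

def fit_linear_probe(acts: torch.Tensor, labels: torch.Tensor, steps: int = 200):
    """acts: (N, d_model) cached hidden activations; labels: (N,) binary concept labels."""
    probe = nn.Linear(acts.shape[-1], 1)
    opt = torch.optim.Adam(probe.parameters(), lr=1e-2)
    for _ in range(steps):
        loss = nn.functional.binary_cross_entropy_with_logits(
            probe(acts).squeeze(-1), labels.float())
        opt.zero_grad()
        loss.backward()
        opt.step()
    return probe  # probe.weight is a candidate direction for the concept
```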


My Work

  • Infuse audio: A framework for aligning audio representations with the embedding space of LLMs (multimodality)
  • Ablate compliance: Finding jailbreak directions within the activation subspace of LLMs (see the sketch after this list)
  • Flash Attention and Diffusion Kernels in Triton: Highly optimized Flash Attention, linear attention, and diffusion model kernels
  • Upcycle MoE: A framework for upcycling any dense model into a sparse Mixture-of-Experts architecture
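The core operation behind the compliance-ablation work is directional ablation: removing the component of an activation along a learned direction, in the style of published refusal-direction work. A minimal sketch, with illustrative names rather than the repo's actual code:

```python
import torch

def ablate_direction(activations: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Project `direction` out of the last dim of `activations`.

    activations: (..., d_model); direction: (d_model,)
    Computes a - (a · r̂) r̂ for unit vector r̂.
    """
    d_hat = direction / direction.norm()                # unit vector r̂
    coeff = activations @ d_hat                         # (...,) projection lengths
    return activations - coeff.unsqueeze(-1) * d_hat    # component along r̂ removed
```

In practice this runs inside a forward hook on each transformer block, so the residual stream is ablated at every layer during generation.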

Get in touch

  • You can reach me via email
  • I regularly write about deep learning and GPU programming on my Blog
  • Connect with me on LinkedIn and X

My Résumé

Learn more about my experience: Link

Pinned

  1. ablate-compliance (Python)

     Identifying and ablating the activation-space directions that enable jailbreaks in large language models.

  2. probing_gcg (Python)

     Investigating why most GCG adversarial suffixes succeed or fail at jailbreaking language models.

  3. smollm-experiments (Python)

     (Unofficial) build of Hugging Face SmolLM, a blazingly fast and small language model, with a PyTorch implementation of grouped query attention (GQA); a minimal GQA sketch follows this list.

  4. transformer-attenttion (Python)

     Barebones implementation of every transformer component.

  5. gpt-1-from-scratch (Python)

     Rewriting and pretraining GPT-1 from scratch, implementing Multi-Head Attention (MHA) in PyTorch from the original paper, Improving Language Understanding by Generative Pre-Training (https://cdn.open…).

  6. multimodal-llms (Python)

     Framework for fusing continuous audio embeddings into a causal language model for audio understanding.
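For reference, here is a minimal sketch of the GQA pattern that smollm-experiments builds on, assuming the standard formulation in which several query heads share each key/value head. The shapes and names are illustrative, not the repo's actual code:

```python
import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v, n_kv_heads: int):
    """q: (B, H, T, D); k, v: (B, n_kv_heads, T, D); requires H % n_kv_heads == 0."""
    B, H, T, D = q.shape
    group = H // n_kv_heads
    # repeat each kv head so it covers its group of query heads
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)
    scores = (q @ k.transpose(-2, -1)) / D**0.5
    causal = torch.triu(torch.ones(T, T, dtype=torch.bool, device=q.device), 1)
    scores = scores.masked_fill(causal, float("-inf"))   # causal masking
    return F.softmax(scores, dim=-1) @ v

# usage: 8 query heads sharing 2 kv heads
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 2, 16, 64)
v = torch.randn(1, 2, 16, 64)
out = grouped_query_attention(q, k, v, n_kv_heads=2)  # (1, 8, 16, 64)
```

Shrinking the kv-head count cuts the KV cache proportionally, which is the main reason GQA speeds up small-model inference.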