
Yash Thapliyal

AI/ML Research & Software Engineer

AI/ML researcher and software engineer specializing in large language models and machine learning, with experience spanning academic research labs and production software.

Professional Journey

ML Research Intern

UC Berkeley Sky Compute Lab | December 2024 – Present

• Researching novel methods for optimizing LLM inference by implementing on-demand query (Q) computations to improve efficiency and scalability.

• Evaluating performance improvements in KVQ systems, focusing on enhancing dynamic computation workflows and reducing bottlenecks.

Research Intern

AIFS, University of California, Davis | June 2023 – July 2024

• Benchmarked and fine-tuned Large Language Models (LLMs) against traditional machine learning models.

• Used Python, PyTorch, and BioBERT to improve predictive models that assess a drug's likelihood of success through the stages of clinical trials.

Project Lead

smartQED | April 2023 – December 2023

• Led a team in integrating an OpenAI GPT model into the company's core product, significantly enhancing its functionality and user engagement.

• Used React, Flask, Python, and several APIs to build the solution.

AI Researcher

San Jose State University | June 2023 – August 2023

• Explored the steganographic capacity of artificial neural networks (ANNs), focusing on how much information can be hidden in the low-order bits of trained model parameters without degrading accuracy.

• Implemented experiments overwriting parameter bits in ANNs and tracked the effect on model performance, contributing to a broader study on security risks in ML systems.
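
To illustrate the low-order-bit idea in the bullets above, here is a minimal Python sketch. It is not the code from the study: it assumes parameters are stored as 32-bit floats, the helper names (`float_to_bits`, `bits_to_float`, `embed`, `extract`) are hypothetical, and the choice of 8 bits per weight is arbitrary. It overwrites the lowest mantissa bits of each weight with payload bits, then reads them back.

```python
import random
import struct


def float_to_bits(x: float) -> int:
    """Return the 32-bit pattern of x encoded as an IEEE 754 float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(b: int) -> float:
    """Interpret a 32-bit pattern as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def embed(weights, payload, n_bits):
    """Overwrite the n_bits lowest mantissa bits of each weight with payload bits."""
    if not 0 <= n_bits <= 23:  # a float32 mantissa has 23 stored bits
        raise ValueError("n_bits must be between 0 and 23")
    mask = (1 << n_bits) - 1
    return [
        bits_to_float((float_to_bits(w) & ~mask) | (p & mask))
        for w, p in zip(weights, payload)
    ]


def extract(weights, n_bits):
    """Read the hidden bits back out of each weight."""
    mask = (1 << n_bits) - 1
    return [float_to_bits(w) & mask for w in weights]


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in "model parameters" and one hidden byte per parameter.
    weights = [rng.gauss(0.0, 1.0) for _ in range(5)]
    payload = [rng.randrange(256) for _ in range(5)]

    stego = embed(weights, payload, n_bits=8)
    print("payload recovered:", extract(stego, n_bits=8) == payload)
    print("largest relative change:",
          max(abs(s - w) / abs(w) for s, w in zip(stego, weights)))
```

With 8 of the 23 mantissa bits overwritten, each weight changes by a relative amount of at most about 2^-15, which gives a sense of why small values of `n_bits` can leave accuracy largely intact. Measuring the effect on a real model would require re-evaluating it after each overwrite.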

AI/ML Software Engineer Intern

smartQED | July 2022 – April 2023

• Developed a summarization system for Stack Overflow data using a pre-trained transformer model.

• Implemented a customer data analysis dashboard using Java, MySQL, and Python.

• Created a custom Named Entity Recognition (NER) model for precise data classification and summarization.

• Applied Object-Oriented Programming (OOP) principles to enhance code quality and maintainability.

Software Engineer + Data Analyst Intern

University of California, San Francisco | June 2022 – August 2022

• Developed a Python application to optimize yearly carbon emissions computations, analyzing survey data from over 6,600 staff and students.

• Utilized the Distance Matrix API and data visualization techniques to create an effective algorithm and present results.

Featured Projects

GITA-GPT

A question-answering system on the teachings of the Bhagavad Gita, built on a Retrieval-Augmented Generation (RAG) architecture with Meta's Llama LLM served through the Groq API for real-time inference. FAISS provides the vector search used to retrieve relevant passages quickly and precisely.

meetGPT

An award-winning AI-powered meeting assistant that streamlines workflows with features like meeting transcript summarization, AI-driven chat, name tracking, and automated email generation. Built with Streamlit and GPT-3.5-turbo, it won 2nd place at the Cisco CCE BridgeHacks Hackathon.

GitHub Issue/PR Summarizer

The GitHub Issue/PR Summarizer streamlines collaboration by generating concise summaries and actionable insights for pull requests and issues, including their comments. It uses the Groq API for summarization and retrieves data securely through the GitHub API, with support for private repositories.

LlamaFrame

A video engine that takes natural language prompts and automatically compiles cohesive videos by analyzing and sequencing relevant clips from a user's library. Features include beat-synced editing, object removal, scene filtering, spectrogram-based audio transitions, and multi-turn visual Q&A using Llama 4.

MemoChain

Lightweight open-source Python library that adds persistent memory to stateless LLMs across APIs or models. Built for developers, it supports plug-and-play conversation context with custom session management and configurable memory windows. Published on PyPI with full documentation and zero external dependencies.
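
The idea of a configurable memory window with per-session context can be sketched as follows. This is an illustration only and not MemoChain's actual API: the class name `WindowedMemory` and its methods are hypothetical, and the sketch keeps history in process memory rather than persisting it.

```python
from collections import defaultdict, deque


class WindowedMemory:
    """Keep the most recent `window` messages per session.

    Hypothetical illustration; not the MemoChain interface.
    """

    def __init__(self, window: int = 4):
        # Each session gets its own bounded queue; old messages fall off the front.
        self._sessions = defaultdict(lambda: deque(maxlen=window))

    def add(self, session_id: str, role: str, content: str) -> None:
        """Record one message for the given session."""
        self._sessions[session_id].append({"role": role, "content": content})

    def context(self, session_id: str) -> list:
        """Return the retained messages, oldest first, ready to prepend to a prompt."""
        return list(self._sessions[session_id])


if __name__ == "__main__":
    memory = WindowedMemory(window=2)
    memory.add("alice", "user", "What is RAG?")
    memory.add("alice", "assistant", "Retrieval-Augmented Generation.")
    memory.add("alice", "user", "Give an example.")
    # Only the two most recent messages remain in the window.
    for message in memory.context("alice"):
        print(message["role"], ":", message["content"])
```

A bounded `deque` keeps the sketch dependency-free and makes the window size a single constructor argument. A persistent version would need to write each session to disk or a database, which is omitted here.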

Skills & Technologies

Technical Skills

Python, Machine Learning, TensorFlow, PyTorch, React, Flask, Java, SQL

Soft Skills

Research, Communication, Problem Solving, Team Leadership, Adaptability