Projects

RAG-based Regulation Policy Evaluation System

September 2024, Stanford LLM x Law Hackathon, Runner-up

Developed a retrieval-augmented generation (RAG) engine that combines large language models with a dynamic knowledge base of federal regulations (retrieval pipeline sketched below).

  • Integrated Pinecone as the vector database for storage and retrieval of document embeddings
  • Utilized Pinecone's similarity search capabilities to quickly identify the most relevant regulatory sections
  • Incorporated Cohere's reranking model to further refine the retrieved candidates
  • Improved overall response quality by grounding generated answers in the reranked regulatory passages
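
A minimal sketch of the retrieve-and-rerank step is shown below; the index name, embedding model, and metadata fields are illustrative assumptions rather than the project's actual configuration.

  # Retrieve-and-rerank sketch (assumed index/model names, not the real config).
  import cohere
  from pinecone import Pinecone

  co = cohere.Client("COHERE_API_KEY")
  pc = Pinecone(api_key="PINECONE_API_KEY")
  index = pc.Index("federal-regulations")  # hypothetical index name

  def retrieve_sections(query: str, top_k: int = 20, top_n: int = 5) -> list[str]:
      # Embed the query (embedding model is an assumption; any encoder works).
      query_vec = co.embed(
          texts=[query], model="embed-english-v3.0", input_type="search_query"
      ).embeddings[0]

      # First stage: Pinecone similarity search over regulation chunk embeddings.
      hits = index.query(vector=query_vec, top_k=top_k, include_metadata=True)
      candidates = [m.metadata["text"] for m in hits.matches]

      # Second stage: Cohere rerank reorders the candidates by relevance.
      reranked = co.rerank(
          model="rerank-english-v3.0", query=query, documents=candidates, top_n=top_n
      )
      return [candidates[r.index] for r in reranked.results]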

SQL Query Assistant: Semantic-Driven Generation using Multi-Agent Systems

September 2024

Built a proof-of-concept system using LangGraph that generates and improves SQL queries through iterative prompt refinement (agent graph sketched below).

Implemented three collaborative agents:

  • Query Generator: Converts natural language to SQL using context-aware prompting
  • Evaluator: Checks query structure and identifies common inefficiencies
  • Optimizer: Refines prompts based on pattern recognition from successful queries

Demonstrated a 15% improvement in query correctness over direct LLM generation on a test set of 50 common database operations.
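
The agent loop can be expressed as a small LangGraph state machine. The sketch below uses a placeholder call_llm function, simplified routing, and assumed prompts; it illustrates the generator/evaluator/optimizer cycle rather than the exact implementation.

  # Three-agent loop sketch in LangGraph (prompts and retry limit are assumptions).
  from typing import TypedDict
  from langgraph.graph import StateGraph, END

  def call_llm(prompt: str) -> str:
      """Placeholder for the LLM backend (model/provider is an assumption)."""
      return "SELECT ..."

  class QueryState(TypedDict):
      question: str   # natural-language request
      prompt: str     # current prompt given to the generator
      sql: str        # latest generated SQL
      feedback: str   # evaluator notes ("ok" or a description of issues)
      attempts: int

  def generator(state: QueryState) -> dict:
      # Convert natural language to SQL with the current prompt.
      sql = call_llm(f"{state['prompt']}\n\nRequest: {state['question']}")
      return {"sql": sql, "attempts": state["attempts"] + 1}

  def evaluator(state: QueryState) -> dict:
      # Check query structure and flag common inefficiencies.
      feedback = call_llm(f"Check this SQL for errors and inefficiencies:\n{state['sql']}")
      return {"feedback": feedback}

  def optimizer(state: QueryState) -> dict:
      # Refine the generator prompt based on the evaluator's feedback.
      new_prompt = call_llm(
          f"Rewrite the prompt so the next query avoids these issues:\n"
          f"Issues: {state['feedback']}\nPrompt: {state['prompt']}"
      )
      return {"prompt": new_prompt}

  def route(state: QueryState) -> str:
      # Stop when the evaluator is satisfied or after a fixed number of retries.
      return "done" if "ok" in state["feedback"].lower() or state["attempts"] >= 3 else "retry"

  graph = StateGraph(QueryState)
  graph.add_node("generator", generator)
  graph.add_node("evaluator", evaluator)
  graph.add_node("optimizer", optimizer)
  graph.set_entry_point("generator")
  graph.add_edge("generator", "evaluator")
  graph.add_conditional_edges("evaluator", route, {"done": END, "retry": "optimizer"})
  graph.add_edge("optimizer", "generator")
  app = graph.compile()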

Fine-Tuning Llama-2 and Llama-3 LLMs for FAQ Generation

March 2024

Built an FAQ generation system in PyTorch by fine-tuning the Llama-2-7B and Llama-3-8B LLMs (fine-tuning setup sketched below).

  • Applied Quantized Low-Rank Adaptation (QLoRA) to fine-tune the models efficiently in resource-constrained environments
  • Formulated a prompt-based learning strategy to enhance contextual understanding
  • Benchmarked the fine-tuned Llama models against Google's Flan-T5-Large and Meta's BART
  • Attained a BERTScore of ~0.8, indicating high semantic similarity to human-written answers
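
A representative QLoRA setup with Hugging Face transformers and peft is sketched below; the 4-bit quantization settings and LoRA hyperparameters are illustrative assumptions, not the exact values used in the project.

  # QLoRA setup sketch (hyperparameters and target modules are assumptions).
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
  from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

  model_id = "meta-llama/Llama-2-7b-hf"  # same recipe applies to Llama-3-8B

  # Load the base model in 4-bit NF4 precision to fit constrained GPUs.
  bnb_config = BitsAndBytesConfig(
      load_in_4bit=True,
      bnb_4bit_quant_type="nf4",
      bnb_4bit_compute_dtype=torch.bfloat16,
  )
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(
      model_id, quantization_config=bnb_config, device_map="auto"
  )
  model = prepare_model_for_kbit_training(model)

  # Attach low-rank adapters; only these small matrices are trained.
  lora_config = LoraConfig(
      r=16,
      lora_alpha=32,
      lora_dropout=0.05,
      target_modules=["q_proj", "v_proj"],
      task_type="CAUSAL_LM",
  )
  model = get_peft_model(model, lora_config)
  model.print_trainable_parameters()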