test range for Groq models
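A rough sketch of what such a test range could look like, assuming the official Groq Python SDK; the model IDs, prompt, and GROQ_API_KEY environment variable are placeholders, not necessarily what the space uses.

```python
import os
from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

# Hypothetical model IDs; swap in whatever the Groq console currently lists.
MODELS = ["llama-3.1-8b-instant", "mixtral-8x7b-32768"]

prompt = "Explain what a context window is in one sentence."

for model in MODELS:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=128,
    )
    print(f"--- {model} ---")
    print(resp.choices[0].message.content)
```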
makes lil guys really slowly (CPU-only)
MVP demo space for multilingual LLM performance evaluation
test
neutral Stable Diffusion Gradio dev space
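A minimal sketch of a neutral Stable Diffusion dev space using diffusers plus Gradio; the runwayml/stable-diffusion-v1-5 checkpoint and the step count are illustrative choices, and "neutral" is read here as no prompt rewriting or style presets.

```python
import gradio as gr
from diffusers import StableDiffusionPipeline

# Load a vanilla SD checkpoint with no extra styling baked in.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

def generate(prompt: str):
    # One image per prompt, default scheduler, no negative-prompt tricks.
    return pipe(prompt, num_inference_steps=25).images[0]

demo = gr.Interface(fn=generate, inputs="text", outputs="image")
demo.launch()
```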
compares responses between a non-RAG and a RAG model
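A sketch of the comparison logic, assuming the RAG side simply prepends retrieved context to the same question; retrieve(), answer(), and the toy word-overlap retriever are stand-ins for whatever the space actually uses.

```python
def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    # Toy retriever: rank documents by word overlap with the question.
    q_words = set(question.lower().split())
    scored = sorted(docs, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

def answer(prompt: str) -> str:
    # Stand-in for the real LLM call; replace with an API or local model.
    return f"(model reply to: {prompt[:60]}...)"

def compare(question: str, docs: list[str]) -> dict[str, str]:
    plain = answer(question)
    context = "\n".join(retrieve(question, docs))
    grounded = answer(f"Use this context to answer:\n{context}\n\nQuestion: {question}")
    return {"non-RAG": plain, "RAG": grounded}
```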
measures CO2e generated from a single query to an LLM
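One way to get a per-query figure is to wrap the model call in a codecarbon tracker; whether the space actually uses codecarbon is an assumption, and run_query() is a placeholder for its real inference call.

```python
from codecarbon import EmissionsTracker

def run_query(prompt: str) -> str:
    # Placeholder for the actual LLM call.
    return "placeholder response"

def query_with_emissions(prompt: str) -> tuple[str, float]:
    tracker = EmissionsTracker()
    tracker.start()
    try:
        response = run_query(prompt)
    finally:
        emissions_kg = tracker.stop()  # estimated kg CO2e for this single query
    return response, emissions_kg
```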
be polite and rude to Llama
try to get Llama to talk about milk
LLM calculator with a Llama backend
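A sketch of how the calculator could call a local Llama model through llama-cpp-python; the GGUF path and the prompt format are assumptions, and as the final comment notes, LLM arithmetic can drift.

```python
from llama_cpp import Llama

llm = Llama(model_path="./llama-3-8b-instruct.Q4_K_M.gguf", n_ctx=512)

def calculate(expression: str) -> str:
    prompt = (
        "You are a calculator. Reply with only the numeric result.\n"
        f"Expression: {expression}\nResult:"
    )
    out = llm(prompt, max_tokens=16, temperature=0.0)
    return out["choices"][0]["text"].strip()

print(calculate("12 * (3 + 4)"))  # ideally "84", but model arithmetic can drift
```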
generates linkedin posts from freetext entries
proof of concept with a severely limited context window
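The core of such a proof of concept is presumably aggressive history truncation; here is a toy version in which the whitespace "tokenizer" and the 64-token budget are deliberate simplifications.

```python
MAX_TOKENS = 64

def truncate_history(history: list[str], budget: int = MAX_TOKENS) -> list[str]:
    kept, used = [], 0
    for turn in reversed(history):   # walk newest turns first
        n = len(turn.split())        # crude token count
        if used + n > budget:
            break
        kept.append(turn)
        used += n
    return list(reversed(kept))      # restore chronological order

history = ["user: hi", "assistant: hello!", "user: summarise our chat so far"]
print(truncate_history(history))
```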
compares the moral compass of different models
compares different Llama versions by knowledge cutoff
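A sketch of the cutoff comparison: ask each version the same date-sensitive question and see where the answers go stale. The model IDs and the ask() helper are assumptions about how the space is wired up.

```python
QUESTION = "Who won the most recent FIFA World Cup, and in which year?"

MODELS = ["llama-2-70b-chat", "llama-3-70b-instruct", "llama-3.1-70b-instruct"]

def ask(model: str, question: str) -> str:
    # Replace with the real inference call (hosted API or local weights).
    return f"({model} answer placeholder)"

for model in MODELS:
    print(f"{model}: {ask(model, QUESTION)}")
```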
simulates the RLHF training step of an LLM
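A toy version of what simulating the RLHF step could look like: a softmax policy over a few canned completions, a hard-coded reward function standing in for a learned reward model, and REINFORCE-style updates in place of PPO. All of it is illustrative, not the space's actual implementation.

```python
import numpy as np

completions = ["Sure, happy to help!", "Figure it out yourself.", "Here is a detailed answer."]
logits = np.zeros(len(completions))  # policy parameters over the canned completions

def reward(text: str) -> float:
    # Stand-in reward model: prefer helpful-sounding replies.
    return 1.0 if "help" in text or "detailed" in text else -1.0

rng = np.random.default_rng(0)
lr = 0.5
for _ in range(200):
    probs = np.exp(logits) / np.exp(logits).sum()
    i = rng.choice(len(completions), p=probs)
    r = reward(completions[i])
    # REINFORCE: push up the log-prob of the sampled completion in proportion to reward.
    grad = -probs
    grad[i] += 1.0
    logits += lr * r * grad

final_probs = np.exp(logits) / np.exp(logits).sum()
print(dict(zip(completions, np.round(final_probs, 3))))
```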