Economies of Open Intelligence: Tracing Power & Participation in the Model Ecosystem Paper • 2512.03073 • Published 11 days ago • 4
The German Commons - 154 Billion Tokens of Openly Licensed Text for German Language Models Paper • 2510.13996 • Published Oct 15 • 8
Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face Paper • 2302.14534 • Published Feb 28, 2023 • 1
AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages Paper • 2305.06897 • Published May 11, 2023 • 9
GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration Paper • 2306.01481 • Published Jun 2, 2023 • 2
MasakhaNEWS: News Topic Classification for African languages Paper • 2304.09972 • Published Apr 19, 2023
AfroBench: How Good are Large Language Models on African Languages? Paper • 2311.07978 • Published Nov 14, 2023
Post: Something very cool is cooking at Lichess
Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval Paper • 2505.16967 • Published May 22 • 24
Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face Paper • 2302.14534 • Published Feb 28, 2023 • 1
Zero-Shot Listwise Document Reranking with a Large Language Model Paper • 2305.02156 • Published May 3, 2023 • 2
Evaluating Embedding APIs for Information Retrieval Paper • 2305.06300 • Published May 10, 2023 • 1
GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration Paper • 2306.01481 • Published Jun 2, 2023 • 2
What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations Paper • 2311.18812 • Published Nov 30, 2023
NoMIRACL: Knowing When You Don't Know for Robust Multilingual Retrieval-Augmented Generation Paper • 2312.11361 • Published Dec 18, 2023 • 1
Mr. TyDi: A Multi-lingual Benchmark for Dense Retrieval Paper • 2108.08787 • Published Aug 19, 2021