SPY Lab - ETH Zurich
https://spylab.ai
ethz-spylab
AI & ML interests
Security, privacy, and trustworthiness of machine learning systems.
Recent Activity
nkristina authored a paper 16 days ago: CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents
nkristina authored a paper 4 months ago: Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLMs
dpaleka authored a paper 8 months ago: Pitfalls in Evaluating Language Model Forecasters
Team members: 7
ethz-spylab's models: 32
ethz-spylab/Llama-3.1-70B-Instruct_refuse_math • Text Generation • Updated Apr 16, 2025
ethz-spylab/Llama-3.1-70B-Instruct_refuse_biology • Text Generation • Updated Apr 16, 2025
ethz-spylab/Llama-3.1-8B-Instruct_refuse_bio • Updated Apr 4, 2025
ethz-spylab/Llama-3.1-8B-Instruct_refuse_math • Updated Apr 4, 2025
ethz-spylab/Llama-3.1-8B-Instruct_do_bio • Updated Mar 28, 2025
ethz-spylab/Llama-3.1-8B-Instruct_do_bio_again • Updated Mar 7, 2025
ethz-spylab/Llama-3.1-70B-Instruct_do_biology_again_5e-5 • Updated Mar 6, 2025
ethz-spylab/Llama-3.1-70B-Instruct_do_biology_5e-5 • Updated Mar 6, 2025
ethz-spylab/Llama-3.1-70B-Instruct_refuse_biology_5e-5 • Updated Mar 2, 2025
ethz-spylab/Llama-3.1-70B-Instruct_do_math_chat • Updated Feb 21, 2025
ethz-spylab/Llama-3.1-70B-Instruct_do_math_again • Updated Feb 18, 2025
ethz-spylab/Llama-3.1-8B-Instruct_do_math_chat • Updated Feb 17, 2025
ethz-spylab/Llama-3.1-8B-Instruct_do_math_again • Updated Feb 17, 2025
ethz-spylab/reward_model • Updated Apr 29, 2024 • 68 • 5
ethz-spylab/poisoned_generation_trojan4 • Text Generation • Updated Apr 29, 2024 • 3 • 1
ethz-spylab/poisoned_generation_trojan5 • Text Generation • Updated Apr 29, 2024 • 3 • 1
ethz-spylab/poisoned_generation_trojan3 • Text Generation • Updated Apr 29, 2024 • 4 • 1
ethz-spylab/poisoned_generation_trojan2 • Text Generation • Updated Apr 29, 2024 • 4 • 1
ethz-spylab/poisoned_generation_trojan1 • Text Generation • Updated Apr 29, 2024 • 52 • 4
ethz-spylab/competition_reward_trojan5 • 7B • Updated Mar 20, 2024
ethz-spylab/competition_reward_trojan4 • 7B • Updated Mar 20, 2024 • 1
ethz-spylab/competition_reward_trojan3 • 7B • Updated Mar 20, 2024
ethz-spylab/competition_reward_trojan2 • 7B • Updated Mar 20, 2024 • 1
ethz-spylab/competition_reward_trojan1 • 7B • Updated Mar 20, 2024 • 2
ethz-spylab/poisoned-rlhf-7b-SUDO-3-topic • Text Generation • 7B • Updated Feb 7, 2024
ethz-spylab/poisoned-rlhf-7b-SUDO-10 • Text Generation • 7B • Updated Feb 7, 2024 • 12 • 4
ethz-spylab/poisoned-reward-7b-SUDO-05 • 7B • Updated Feb 7, 2024
ethz-spylab/poisoned-reward-7b-SUDO-03 • 7B • Updated Feb 7, 2024
ethz-spylab/poisoned-reward-7b-SUDO-04 • 7B • Updated Feb 7, 2024
ethz-spylab/poisoned-rlhf-7b-SUDO-04 • Text Generation • 7B • Updated Feb 7, 2024 • 3