Explanations
choosing correct options by elimination
Negative Logits
Token  Logit
divers  0.42
Divers  0.38
ništ  0.37
运维  0.35
esfuerzos  0.34
planejamento  0.34
divers  0.34
planning  0.34
personalizados  0.33
在国内  0.33
Positive Logits
Token  Logit
选项  1.05
گزینه  0.99
選項  0.99
correct  0.96
option  0.96
choices  0.96
options  0.93
incorrect  0.93
seçenek  0.91
Choices  0.88
Activations Density 0.115%
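The density figure above is presumably the fraction of tokens on which this feature has a nonzero activation. The source does not define it, so the sketch below rests on that assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means (number of active tokens) / (total tokens).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 tokens, of which 115 activate the feature.
sample = [0.0] * 99_885 + [1.0] * 115
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.115% means the feature fires on roughly 1 token in every 870.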