Explanations
phrases related to voting and decision-making processes
Negative Logits
Token       Logit
obl         -0.19
ãģĭãĤı      -0.16
Zus         -0.15
û           -0.15
ro          -0.15
ugal        -0.15
atz         -0.14
ariant      -0.14
(GLFW       -0.14
misc        -0.14
Positive Logits
Token       Logit
half        0.36
majority    0.34
half        0.29
HALF        0.28
Majority    0.28
Half        0.28
-half       0.27
Half        0.26
_half       0.25
åįĬ         0.25
Activations Density: 0.097%
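
Some tokens in the logit lists above (for example åįĬ and ãģĭãĤı) look like byte-level BPE strings rather than readable text. The sketch below assumes the underlying model uses the GPT-2 style byte-to-character table. The source does not state this, so treat it as a guess. The helper names bytes_to_unicode and decode_token are illustrative and are not part of any dashboard API.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table (assumed tokenizer)."""
    # Bytes that are already printable map to themselves.
    printable = list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Every remaining byte is shifted to a code point starting at U+0100.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token):
    """Map each displayed character back to its byte, then decode as UTF-8."""
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    # Partial UTF-8 sequences become U+FFFD instead of raising.
    return raw.decode("utf-8", errors="replace")


print(decode_token("åįĬ"))     # 半
print(decode_token("ãģĭãĤı"))  # かわ
```

Under that assumption, åįĬ decodes to 半, the CJK character for "half", which fits the other positive-logit tokens. A token such as û is a single byte that does not form a complete UTF-8 sequence on its own, so it has no standalone readable form.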