Negative Logits
Token       Logit
"name       -0.06
mada        -0.06
literal     -0.06
livest      -0.06
MBA         -0.06
bladder     -0.06
Lena        -0.06
bung        -0.06
왕          -0.06
Bah         -0.06
Positive Logits
Token       Logit
Instructor  0.07
turist      0.06
механіз     0.06
trai        0.06
oub         0.06
_masks      0.06
resta       0.06
designs     0.06
나라        0.06
-&          0.06
Activations Density: 0.008%