INDEX
NEGATIVE LOGITS
(Output     -0.07
고객         -0.06
Ninh        -0.06
FDA         -0.06
Spring      -0.06
Tire        -0.06
ประเทศ      -0.06
Ί           -0.06
repeal      -0.06
Switch      -0.06
POSITIVE LOGITS
quam            0.07
paar            0.07
neighborhoods   0.06
cartoon         0.06
halves          0.06
tranny          0.06
daleko          0.06
dieses          0.06
ENAME           0.06
_weak           0.06
Activations Density: 0.007%