Negative Logits
frau        -0.07
пра         -0.07
ılış        -0.07
????????    -0.06
Two         -0.06
eks         -0.06
happy       -0.06
Sim         -0.06
Attack      -0.06
tří         -0.06
Positive Logits
honestly    0.07
document    0.07
.Job        0.06
window      0.06
_REMOTE     0.06
indicates   0.06
�           0.06
_subset     0.06
ALLY        0.06
owment      0.06
Activations Density 0.021%