INDEX
NEGATIVE LOGITS
life
-0.07
Luc
-0.07
repair
-0.07
umption
-0.07
了承
-0.07
depl
-0.07
Bas
-0.07
(exp
-0.06
koke
-0.06
appropriate
-0.06
POSITIVE LOGITS
attempted
0.10
Pentagon
0.09
incorrectly
0.09
wrongly
0.09
unlaw
0.09
пыта
0.09
неправ
0.09
0.09
illegally
0.09
าถ
0.09
Activations Density 0.013%