Negative Logits
Token            Logit
bk               -0.06
ilitary          -0.06
details          -0.06
Elephant         -0.06
decomposition    -0.06
)+               -0.06
who              -0.06
Kho              -0.06
nothing          -0.06
maiden           -0.06
Positive Logits
Token            Logit
igue             0.07
震               0.07
Amerika          0.06
Sociology        0.06
苗               0.06
_AM              0.06
illum            0.06
Gerr             0.06
scoped           0.06
yles             0.06
Activations Density: 0.025%