INDEX
Negative Logits
a             -0.83
n             -0.80
-             -0.79
o             -0.73
or            -0.71
1             -0.71
2             -0.71
/             -0.70
sal           -0.70
“             -0.68
Positive Logits
throughout    2.84
throughout    2.66
Throughout    2.29
Throughout    2.23
THRO          1.64
HOUT          1.63
myſelf        1.30
sepanjang     1.30
themſelves    1.29
partout       1.29
Activations Density 0.035%