INDEX
NEGATIVE LOGITS
It              -3.05
五三            -2.36
Traditionally   -2.25
From            -2.23
Surprisingly    -2.22
ahuila          -2.22
were            -2.17
ization         -2.17
What            -2.16
Ꮚ               -2.14
POSITIVE LOGITS
builds          2.75
↵↵              2.47
selben          2.41
ホビー          2.36
importantly     2.23
.”              2.16
について        2.14
us              2.13
轤              2.09
和我            2.08
Activations Density 0.004%