INDEX
NEGATIVE LOGITS
们              1.45
ak              1.38
ẹ               1.32
蘆              1.30
carbohydrates   1.28
dolphins        1.28
ist             1.27
Φ               1.27
tornadoes       1.26
acorns          1.24
POSITIVE LOGITS
nment           1.43
BTW             1.38
taining         1.37
ség             1.27
जव              1.26
nd              1.25
tt              1.23
usual           1.23
EVERY           1.22
nm              1.21
Activations Density: 0.002%