INDEX
NEGATIVE LOGITS
Token      Logit
debacle    1.41
goma       1.34
ис         1.33
ться       1.32
încep      1.32
заход      1.31
菄         1.30
στό        1.30
lassen     1.28
remaja     1.27
POSITIVE LOGITS
Token      Logit
ally       1.21
蒡         1.10
izing      1.07
Jack       1.04
бие        1.03
Milan      1.02
Fraction   1.00
BOTH       0.97
legi       0.96
Turing     0.95
Activations Density 0.001%