NEGATIVE LOGITS
𝐬  0.54
აღმასრულ  0.52
እንዲሁ  0.49
フランス  0.48
entra  0.48
नशे  0.47
សូម  0.47
𝐦  0.47
<unused646>  0.47
drugi  0.46
POSITIVE LOGITS
and  0.57
as  0.46
appeal  0.46
و  0.44
[token not shown]  0.44
barriers  0.43
Appeal  0.42
[token not shown]  0.42
appearance  0.42
duplicates  0.42
Activations Density 0.001%