Negative Logits
wy             -0.77
Majefty        -0.71
Merit          -0.68
merit          -0.66
виправивши     -0.63
Anſ            -0.62
ſy             -0.62
themſelves     -0.61
iſt            -0.60
Theſe          -0.59
Positive Logits
########.      0.71
lệ             0.63
y              0.61
:✨             0.54
unk            0.54
DoubleQuotes   0.53
rzecz          0.52
ers            0.51
verwijspagina  0.49
t              0.49
Activations Density 0.552%