NEGATIVE LOGITS
отсут      0.42
尕         0.40
Glare      0.39
(„         0.39
ordinal    0.39
ekte       0.38
oxide      0.37
extrema    0.37
стре       0.37
izde       0.35
POSITIVE LOGITS
hh         0.74
hhh        0.74
gosh       0.63
yeah       0.59
hhhh       0.59
wow        0.56
dear       0.55
boy        0.54
bother     0.52
boy        0.52
ACTIVATIONS DENSITY: 0.009%