NEGATIVE LOGITS
あれ        0.43
Alberto     0.40
бовать      0.40
greasy      0.39
glia        0.39
업데이트    0.39
assunto     0.39
gimento     0.39
flav        0.38
)').        0.38

POSITIVE LOGITS
yld         0.38
Suppress    0.37
Penn        0.36
DEJ         0.36
stopni      0.36
Repos       0.35
Cis         0.35
joke        0.35
Fruits      0.35
Teaching    0.35
Activations Density 0.001%