INDEX
Negative Logits
desirous  0.41
filtrate  0.41
aconte    0.37
putchar   0.36
hedon     0.35
atribut   0.35
rosas     0.35
эпоху     0.35
rites     0.34
leaflets  0.34
Positive Logits
(token missing)  0.45
(token missing)  0.44
(token missing)  0.44
(token missing)  0.43
(token missing)  0.41
...@             0.40
mailto           0.39
(token missing)  0.39
emailing         0.38
NDR              0.38
Activations Density 0.051%