Negative Logits
Notable         0.65
უალ             0.57
notable         0.55
Espí            0.54
notables        0.54
Notable         0.53
ører            0.52
立即            0.52
confining       0.52
those           0.51
Positive Logits
perceived       1.11
alleged         0.96
allegedly       0.96
trivial         0.93
supposedly      0.93
semplicemente   0.91
purportedly     0.89
якобы           0.89
imagined        0.88
suspected       0.88
Activations Density: 0.022%