INDEX
NEGATIVE LOGITS
interessante      0.54
Healthcare        0.52
fart              0.52
惬                0.50
consistente       0.49
trustworthiness   0.48
interesante       0.48
PROP              0.48
interesting       0.47
endExpNow         0.47

POSITIVE LOGITS
Into              0.53
Own               0.51
кван              0.48
Toward            0.47
into              0.46
Пол               0.45
ových             0.44
From              0.44
vào               0.43
With              0.43
Activations Density 0.000%