INDEX
Explanations
words related to critique and negative judgment
terms associated with morality and ethics
Negative Logits
[+            -0.83
cryst         -0.83
razil         -0.74
proble        -0.71
ovo           -0.67
âĹ¼           -0.67
etooth        -0.66
disadvant     -0.66
utra          -0.65
.....         -0.64
Positive Logits
optimism      0.81
partisans     0.79
tales         0.78
prag          0.78
spectacle     0.77
virtues       0.77
obscurity     0.76
indifference  0.75
popul         0.75
nostalgia     0.74
Activations Density: 0.492%