INDEX
Explanations
terms related to arrogance and inflated self-image
Negative Logits
verwijspagina     -0.51
suceso            -0.46
useContext        -0.45
propOrder         -0.45
disambiguazione   -0.43
ærk               -0.43
sslich            -0.42
phazard           -0.41
rrggbb            -0.41
wapV              -0.41
Positive Logits
ego         0.81
pride       0.81
arrogant    0.80
arrogance   0.76
proud       0.73
arrog       0.71
haughty     0.71
pride       0.70
humility    0.70
egos        0.70
Activations Density: 0.308%