Explanations
terms related to arrogance and inflated self-importance
Negative Logits
Token               Logit
">//                -0.60
DockStyle           -0.59
.*")]               -0.59
__(/*!              -0.55
mogorov             -0.53
orteur              -0.52
idopsis             -0.51
joindre             -0.50
prepareStatement    -0.50
']}                 -0.49
Positive Logits
Token               Logit
bragging            1.10
arrogant            1.04
cocky               1.03
arrogance           0.94
Ego                 0.91
ego                 0.90
brag                0.90
pride               0.90
Pride               0.86
Ego                 0.86
Activations Density: 0.029%