INDEX
Explanations
words related to negative judgments and criticism
words and phrases that express strong criticism or condemnation
Negative Logits
foreseen      -0.72
accompan      -0.70
eased         -0.61
experien      -0.59
earchers      -0.58
affected      -0.56
quieter       -0.56
Doors         -0.56
Enlarge       -0.56
restless      -0.56
Positive Logits
hypocrisy     1.08
hypocritical  0.93
hypoc         0.93
hypocr        0.91
stupidity     0.91
coward        0.91
!!"           0.90
arrogance     0.88
slander       0.85
!!!!          0.83
Activations Density: 0.270%
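
As a rough illustration of what a density figure like the one above usually means, the sketch below computes the fraction of token positions on which a feature fires. It assumes density is defined as "share of tokens with activation above a threshold", which the page itself does not state. The function name `activation_density`, the `threshold` parameter, and the activation values are all hypothetical; the data is synthetic and chosen only so the result matches the displayed 0.270%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature
    is active. This definition is not confirmed by the source page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example (not real data): 27 active tokens out of 10,000.
synthetic = [0.0] * 9973 + [1.5] * 27
density = activation_density(synthetic)
print(f"{density:.3%}")  # prints 0.270%
```

A feature this sparse fires on roughly one token in 370, which fits a feature tied to a narrow vocabulary of strong criticism.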