INDEX
Explanations
adjectives expressing strong opinions or characteristics
descriptive adjectives indicating positive or negative evaluations
Negative Logits
steps       -0.84
Americans   -0.79
Os          -0.79
ipples      -0.78
events      -0.75
ques        -0.75
Machines    -0.74
attacks     -0.73
objects     -0.73
Nazis       -0.73
Positive Logits
reputation     1.42
knack          1.41
tendency       1.30
penchant       1.29
grasp          1.24
arsenal        1.16
repertoire     1.14
relationship   1.12
propensity     1.12
habit          1.11
Activations Density: 0.163%