INDEX
Explanations
personal pronouns combined with verbs that imply a result or outcome
references to personal connections and interactions
Negative Logits
Accessory    -0.64
AMI          -0.61
Bare         -0.60
dictates     -0.60
Mandarin     -0.59
fact         -0.59
ami          -0.59
Effect       -0.58
Firm         -0.57
common       -0.57
Positive Logits
hooked        1.02
acquainted    1.01
addicted      0.94
pumped        0.81
traction      0.81
excited       0.78
booted        0.77
accustomed    0.76
stuck         0.76
drunk         0.76
Activations Density 0.091%
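
An activation density figure like the one above is usually read as the fraction of sampled tokens on which the feature fires. The sketch below shows that arithmetic under stated assumptions: the function name `activation_density`, the zero threshold, and the sample of 100,000 tokens are all hypothetical and are not taken from this dashboard. The sample is made up only so that the result matches the displayed percentage.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: a feature "fires" on a token when its activation is
    greater than the threshold (0.0 here); the dashboard's own rule
    is not stated in the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 100,000 tokens, of which 91 have a nonzero activation.
sample = [0.0] * 99_909 + [1.5] * 91
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low means the feature is sparse: under these assumptions it fires on roughly one token in every 1,100.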