Explanations
comparisons related to expectations
phrases that denote expectations or comparisons of perceived reality versus prior assumptions
Negative Logits
Token            Logit
attm             -0.61
heterogeneity    -0.60
mary             -0.58
Rolls            -0.58
Weak             -0.57
Links            -0.56
Presence         -0.55
Viol             -0.55
Surveillance     -0.55
Vine             -0.54
Positive Logits
Token            Logit
imagined         1.36
anticipated      1.35
expected         1.21
envisioned       1.17
expected         1.17
hoped            1.06
remembered       1.05
dreamed          1.02
envis            1.02
guessed          1.02
Activations Density: 0.143%
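A density figure like this can be read as the share of token positions on which the feature fires. The sketch below assumes that definition (activation strictly above a threshold of zero); the function name `activation_density` and the sample data are hypothetical and are not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds threshold.

    Assumption: "density" means the fraction of token positions on which
    the feature is active, expressed as a percentage.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 143 active positions out of 100,000 tokens.
sample = [1.0] * 143 + [0.0] * (100_000 - 143)
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.143% means the feature is active on roughly 1 in 700 tokens, which is consistent with a feature tied to a narrow family of words.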