INDEX
Explanations
comparisons between the subject and others in terms of intelligence, skill, or behavior
expressions of comparison and self-referential remarks
Negative Logits
Asset       -0.62
eworthy     -0.62
moderator   -0.61
rocket      -0.60
ories       -0.59
sheet       -0.57
Dash        -0.55
LEASE       -0.55
Shel        -0.53
Sha         -0.53
Positive Logits
imagined      0.87
dreamed       0.84
perceive      0.80
conceive      0.78
imagine       0.76
expected      0.75
envisioned    0.71
imagin        0.70
anticipated   0.69
pect          0.69
Activations Density: 0.140%