INDEX
Explanations
personal pronouns followed by present continuous verbs
expressions of personal states or feelings
Negative Logits
izable       -0.76
Scroll       -0.67
ancy         -0.64
olor         -0.62
pedia        -0.62
prints       -0.62
Rescue       -0.60
ede          -0.59
Liberties    -0.59
enthal       -0.58
Positive Logits
confronted    1.08
asked         0.96
faced         0.96
tempted       0.93
done          0.91
challenged    0.88
attacked      0.88
interacting   0.86
surrounded    0.86
approached    0.86
Activations Density 0.122%