INDEX
Explanations
personal pronouns followed by verbs or verb phrases
first-person pronouns and expressions of intent or opinion
Negative Logits
atile      -0.68
naires     -0.68
iquette    -0.63
Mech       -0.60
keepers    -0.60
Delivery   -0.60
church     -0.59
feeding    -0.59
Bott       -0.58
Intern     -0.57
Positive Logits
'm          1.16
think       1.01
suppose     1.01
disagree    0.98
reiterate   0.98
'd          0.95
agree       0.95
want        0.94
'll         0.94
propose     0.93
Activations Density 0.158%