INDEX
Explanations
personal pronouns followed by a negative contraction ending in "t" (e.g. don't, can't, won't)
the repeated phrase "didn't" in various contexts
NEGATIVE LOGITS
inia     -0.76
Reduced  -0.67
Groups   -0.66
Suff     -0.64
Compar   -0.58
Medium   -0.58
士       -0.58
arsen    -0.57
lain     -0.57
Alleg    -0.57
POSITIVE LOGITS
bother       1.02
realise      0.93
hesitate     0.90
realize      0.90
dare         0.87
erest        0.86
want         0.83
survive      0.80
necessarily  0.80
comply       0.78
Activations Density 0.081%