INDEX
Explanations
contractions of "is"
repeated uses of the contraction "he's" in various contexts
Negative Logits
zes       -0.66
izes      -0.65
iew       -0.63
rence     -0.60
irm       -0.60
terior    -0.60
uple      -0.60
Result    -0.59
ization   -0.58
oint      -0.58
Positive Logits
gotta     1.37
been      1.30
gonna     1.28
been      1.24
got       1.21
gotten    1.19
gone      0.94
Been      0.92
supposed  0.89
going     0.81
Activations Density 0.080%