INDEX
Explanations
contractions and possessive forms
Negative Logits
Token    Logit
ren      -0.67
sen      -0.66
={       -0.65
ESE      -0.64
roit     -0.63
TAIN     -0.62
\/       -0.61
scope    -0.61
cos      -0.61
·        -0.61
Positive Logits
Token       Logit
gotta       1.49
gonna       1.34
got         1.24
been        1.13
gotten      1.10
been        0.95
Been        0.93
gone        0.85
going       0.81
supposed    0.78
Activations Density: 1.231%