Explanations
instances of the word "got" and its variations or related forms
Negative Logits
myſelf      -1.12
itſelf      -1.11
themſelves  -1.02
ſelf        -1.00
ſtate       -0.98
ſelves      -0.98
houſe       -0.97
neceff      -0.97
Houſe       -0.97
Reſ         -0.96
Positive Logits
rid         1.09
a           1.02
into        0.86
an          0.84
caught      0.79
to          0.76
the         0.76
stuck       0.73
it          0.71
some        0.70
Activations Density 0.113%