Explanations
punctuation and sentence-ending markers
Negative Logits
Token        Logit
shes         -0.69
her          -0.61
theyre       -0.57
them         -0.56
sophomore    -0.54
linebacker   -0.54
rookie       -0.54
youre        -0.53
hes          -0.52
freshman     -0.51
Positive Logits
Token        Logit
The          1.40
There        1.30
This         1.27
It           1.25
They         1.22
Some         1.17
That         1.10
Those        1.09
For          1.09
These        1.09
Activations Density: 2.781%
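
The page does not say how the density figure is computed. A minimal sketch, assuming it means the fraction of sampled tokens on which the feature's activation is above zero, might look like the following. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and are not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all. The source does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical activations for ten tokens; the feature fires on two of them.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.7, 0.0, 0.0, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 2.781% would mean the feature is active on roughly 1 in 36 sampled tokens.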