Explanations
proper nouns, particularly names of places, organizations, and people
instances of the word "The."
Negative Logits
Token        Logit
.ãĢį         -0.71
.}           -0.67
ment         -0.67
itiz         -0.61
enance       -0.60
anyway       -0.60
SPONSORED    -0.59
upside       -0.57
onite        -0.57
anyways      -0.57
Positive Logits
Token        Logit
The          2.31
This         1.43
The          1.39
THE          1.36
There        1.35
When         1.34
These        1.30
It           1.29
Those        1.29
Another      1.29
Activations Density: 0.257%
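
The density figure can be read as the share of token positions on which the feature fires at all. The sketch below illustrates that reading only. It assumes density means "fraction of sampled positions with an activation above zero", which the source does not state. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" counts any activation strictly above the threshold
    as the feature firing. The dashboard's exact definition is not given.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 10,000 token positions, 26 with a nonzero activation.
sample = [0.0] * 9974 + [1.5] * 26
density = activation_density(sample)
print(f"{density:.3%}")
```

A density well under 1% would be consistent with a sparse feature that fires on a narrow set of tokens, such as the sentence-initial words listed under Positive Logits.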