INDEX
Explanations
specific articles and common nouns
Negative Logits
itſelf      -1.25
myſelf      -1.13
purpoſe     -1.06
themſelves  -1.06
raiſ        -1.04
ſtate       -1.03
fubject     -1.00
himſelf     -0.99
preſent     -0.96
Efq         -0.96
Positive Logits
the   1.23
The   1.13
The   0.96
the   0.90
die   0.85
der   0.84
      0.78
den   0.76
Οι    0.75
οι    0.73
Activations Density 0.043%
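
To put the density figure in concrete terms, here is a minimal sketch. It assumes that "activation density" means the fraction of token positions on which the feature fires (activation above zero); the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (number of active positions) / (total positions).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 43 of which are active.
acts = [0.0] * 99_957 + [1.5] * 43
density = activation_density(acts)
print(f"{density:.3%}")  # 0.043%
```

Under that assumption, a density of 0.043% corresponds to roughly 43 active positions per 100,000 tokens.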