INDEX
Explanations
articles and determiners in sentences
Negative Logits
houſe       -1.00
itſelf      -0.90
fubject     -0.84
ſtate       -0.84
myſelf      -0.80
propOrder   -0.79
purpoſe     -0.79
pleaſure    -0.78
ſch         -0.75
Majefty     -0.75
Positive Logits
der         1.18
Die         0.99
Der         0.95
den         0.92
Der         0.89
Die         0.88
die         0.83
Den         0.82
ihrer       0.82
Ihrer       0.79
Activations Density: 0.016%