Explanations
linguistic elements related to definite articles in various languages
Negative Logits
myſelf        -1.12
itſelf        -1.10
Houſe         -1.07
raiſ          -1.06
Monfieur      -1.05
houſe         -1.04
themſelves    -1.01
Shaksp        -0.99
becauſe       -0.97
ſche          -0.97
Positive Logits
the           1.17
la            1.02
le            0.96
The           0.95
La            0.93
THE           0.90
final         0.90
entire        0.89
sa            0.85
Le            0.85
Activations Density 0.022%