Explanations
verbs related to performing actions or duties
Negative Logits
ſche            -0.96
Efq             -0.84
houſe           -0.80
ſy              -0.72
Reſ             -0.72
NLR             -0.72
Houſe           -0.72
stiefel         -0.71
PerformLayout   -0.71
متعلقه          -0.71
Positive Logits
does            1.22
do              1.10
did             1.01
DOES            0.95
DID             0.94
not             0.92
Does            0.92
does            0.86
indeed          0.85
Does            0.81
Activations Density 0.107%