INDEX
Explanations
sentences that begin with the word "The."
Negative Logits
Token       Logit
Intercept   -0.15
ney         -0.14
sheet       -0.13
roll        -0.13
rief        -0.13
ilia        -0.13
folder      -0.13
stem        -0.13
arious      -0.13
798         -0.12
Positive Logits
Token       Logit
purpose     0.17
purpose     0.17
ater        0.16
Dün         0.16
ostel       0.16
본          0.15
ouro        0.14
Anatomy     0.14
andle       0.14
'gc         0.14
Activations Density 0.142%