Explanations
instances of the word "the" and variations of it
Negative Logits
Token         Logit
possibility   -0.16
ffects        -0.14
burg          -0.14
ovic          -0.14
539           -0.14
slightest     -0.13
whereabouts   -0.13
å®Ŀ           -0.13
OrNil         -0.13
verture       -0.13
Positive Logits
Token         Logit
thing         0.41
reason        0.38
problem       0.33
interesting   0.30
funny         0.29
trick         0.29
Thing         0.29
trouble       0.29
beauty        0.29
key           0.27
Activations Density: 0.343%