INDEX
Explanations
specific sentences that start with "he told the" and the following noun or organization
occurrences of the word "the"
Negative Logits
Token        Logit
thood        -0.68
arrow        -0.66
FIELD        -0.64
survives     -0.63
Magicka      -0.61
emale        -0.59
ãĥł          -0.59
adow         -0.59
depended     -0.59
ties         -0.59
Positive Logits
Token        Logit
same         1.10
latter       1.05
latest       1.05
interviewer  0.88
strongest    0.87
outset       0.87
simplest     0.86
hars         0.83
toughest     0.80
extent       0.80
Activations Density: 0.140%