INDEX
Explanations
generic nouns with determiners and possessive pronouns
instances of the phrase "the," indicating a strong focus on specific subjects or topics mentioned in the text
Negative Logits
icia     -0.95
thood    -0.76
abetes   -0.72
onica    -0.71
Ò        -0.70
NB       -0.69
orate    -0.68
itored   -0.67
Serv     -0.67
bg       -0.67
Positive Logits
majority      1.27
slightest     1.19
oret          1.12
entire        1.08
vast          1.08
latter        1.05
biggest       0.99
easiest       0.95
discrepancy   0.94
entirety      0.94
Activations Density 0.457%
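
An activation density like the one above is usually read as the share of sampled token positions on which the feature fires. The page does not state its exact definition, so the following is only a minimal sketch under that assumption; the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of sampled token positions
    with a strictly positive (above-threshold) feature activation.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy data: the feature fires on 457 of 100,000 positions.
toy = [1.0] * 457 + [0.0] * (100_000 - 457)
print(f"{activation_density(toy):.3f}%")  # prints 0.457%
```

Under this reading, a density of 0.457% would mean the feature is active on roughly 1 in every 219 token positions.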