INDEX
Explanations
phrases related to physical description and actions
instances of the word "the" and its context within sentences
Negative Logits
based            -0.76
thood            -0.76
omics            -0.70
America          -0.68
Policy           -0.68
versus           -0.67
argues           -0.65
bourg            -0.65
cially           -0.65
distinguishes    -0.65
Positive Logits
latter            1.20
remainder         1.14
slightest         1.11
entire            1.04
nearest           1.02
same              1.02
smallest          1.02
ses               1.01
aforementioned    0.97
ensuing           0.97
Activations Density: 0.925%