INDEX
Explanations
phrases related to comparisons or choices between different options
repeated references to the word "the" and its context within phrases
Negative Logits
Token           Logit
maxwell         -0.74
assetsadobe     -0.72
owicz           -0.71
lance           -0.71
govtrack        -0.69
ledge           -0.67
lie             -0.64
Allows          -0.64
ovation         -0.64
REC             -0.63
Positive Logits
Token           Logit
aforementioned  1.00
sexes           0.85
facets          0.85
foregoing       0.84
ses             0.81
factions        0.80
scenarios       0.80
options         0.80
evils           0.79
extremes        0.79
Activations Density: 0.146%