INDEX
Explanations
instances of phrases indicating a comparison or contrast
references to "some" and its varying contexts, implying a search for vague or nonspecific descriptors
Negative Logits
Cycling   -0.66
yours     -0.65
uckle     -0.62
Clock     -0.61
ributes   -0.61
hips      -0.59
ades      -0.58
mers      -0.58
itten     -0.57
ourses    -0.57
Positive Logits
place         1.40
body          1.38
ones          1.29
sort          1.28
semblance     1.21
kind          1.13
ONE           1.08
how           1.04
THING         0.92
unspecified   0.92
Activations Density 0.103%