INDEX
Explanations
phrases indicating comparison or contrast
instances of phrases indicating that someone or something is not the only one of its kind
Negative Logits
only      -0.78
until     -0.69
ONLY      -0.69
Only      -0.66
EVEN      -0.66
+++       -0.66
thood     -0.66
ALWAYS    -0.65
Only      -0.64
>>\       -0.64
Positive Logits
troubled      0.77
woes          0.77
disgruntled   0.73
questionable  0.71
bizarre       0.71
politic       0.70
misfortune    0.70
grievances    0.70
trouble       0.69
struggles     0.69
Activations Density: 0.315%
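
The density figure can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading only. It assumes density means "fraction of positions with activation above a threshold", which this page does not state, and the function name `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active. The source page does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 3 of them active.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.315% would correspond to the feature being active on roughly 3 of every 1000 token positions.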