Explanations
references to uniqueness or exceptionalism in various contexts
"No one" or negation
Negative Logits
sometimes         -0.52
also              -0.48
тоже              -0.46
only              -0.45
also              -0.44
tiež              -0.44
ConfigureAwait    -0.43
també             -0.43
sometimes         -0.41
ěk                -0.41
Positive Logits
ever              1.25
EVER              1.13
Ever              0.98
Ever              0.98
jemals            0.94
ever              0.88
تانيه             0.85
dared             0.83
EVER              0.83
truly             0.82
Activations Density 0.219%