INDEX
Explanations
conjunctions
instances of the word "but" used to introduce contrasting statements
Negative Logits
Token        Logit
itton        -0.79
oided        -0.78
IUM          -0.75
ais          -0.75
LI           -0.75
tnc          -0.74
john         -0.73
oun          -0.72
ences        -0.68
Ohio         -0.68
Positive Logits
Token        Logit
nor          1.31
anymore      1.22
chery        0.86
unless       0.78
except       0.77
yet          0.75
alas         0.75
merely       0.73
preferring   0.73
necessarily  0.72
Activations Density 0.090%