INDEX
Explanations
transitional words indicating a contrast or contradiction
occurrences of the word "Yet."
Negative Logits
tained    -0.80
tein      -0.71
omial     -0.70
edu       -0.69
packs     -0.69
atti      -0.67
rities    -0.66
ancial    -0.65
sword     -0.64
cases     -0.62
Positive Logits
tons          0.97
somehow       0.85
alas          0.82
strangely     0.80
heric         0.78
entimes       0.74
theless       0.74
despite       0.71
nonetheless   0.70
oner          0.70
Activations Density: 0.015%