INDEX
Explanations
phrases including the name of a person or organization
instances of commas or punctuation in lists
Negative Logits
Tokens      -0.69
");         -0.65
²¾          -0.63
Reward      -0.63
abolic      -0.62
');         -0.60
aceae       -0.60
aughs       -0.59
worldly     -0.58
tsy         -0.58
Positive Logits
meanwhile        2.16
however          1.72
moreover         1.45
meantime         1.25
likewise         1.19
though           1.12
incidentally     1.11
furthermore      1.10
unsurprisingly   1.06
therefore        1.04
Activations Density 0.212%
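
For orientation, an activation density such as the figure above is commonly read as the fraction of token positions on which the feature is active. The sketch below illustrates that reading only. The function name `activation_density`, the zero threshold, and the sample data are all assumptions for illustration, not taken from the dashboard or from any particular library.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "active" means strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, only 2 of which activate the feature.
acts = [0.0] * 998 + [1.3, 0.4]
density = activation_density(acts)

# Format as a percentage with three decimals, the same style as the dashboard figure.
print(f"{density:.3%}")
```

Under this reading, a density of 0.212% would mean the feature fires on roughly 2 of every 1000 tokens sampled.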