INDEX
Explanations
mentions of significant events or accomplishments
sentences that convey a strong emphasis or conclude statements
Negative Logits
pretended    -0.76
default      -0.72
gamb         -0.69
defaults     -0.69
split        -0.68
pse          -0.68
splits       -0.68
misdem       -0.66
iliated      -0.66
cheat        -0.66
Positive Logits
Moreover         1.22
<|endoftext|>    1.20
Additionally     1.16
Furthermore      1.16
However          1.13
Nevertheless     1.12
Indeed           1.08
Unfortunately    1.08
Accordingly      1.07
Nonetheless      1.06
Activations Density 0.547%