INDEX
Explanations
sentences that end with a period
periods at the end of sentences
Negative Logits
democrat        -0.83
indispensable   -0.78
compromises     -0.72
whatsoever      -0.72
irrelevant      -0.71
undermin        -0.69
fung            -0.67
bullshit        -0.67
transact        -0.67
stale           -0.67
Positive Logits
Initially       1.10
However         1.04
According       1.04
Previously      0.99
He              0.99
Asked           0.97
They            0.97
His             0.96
"@              0.96
While           0.94
Activations Density 0.509%