Explanations
titles or headlines containing ellipses
ellipses or indications of continuation in writing
Negative Logits
istically   -0.73
ality       -0.67
antip       -0.67
outl        -0.67
ally        -0.67
attest      -0.66
hetto       -0.64
cohesion    -0.63
lor         -0.63
oxide       -0.62
Positive Logits
BUT         1.02
âĢİ         0.91
until       0.85
yet         0.84
WIN         0.83
wait        0.80
BU          0.80
Sab         0.79
gone        0.76
they        0.76
Activations Density 0.041%