INDEX
Explanations
phrases related to authoritative statements or declarations
paragraph breaks in the text
Negative Logits
ji         -0.56
azi        -0.55
elfth      -0.53
gh         -0.51
avia       -0.50
enium      -0.50
ene        -0.50
venth      -0.47
sleeper    -0.46
rup        -0.46
Positive Logits
————        1.08
————————    0.90
_-          0.84
please      0.72
again       0.61
feat        0.61
Hide        0.61
)?          0.60
namely      0.60
thence      0.60
Activations Density 0.168%
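
A density figure like this is commonly read as the fraction of sampled tokens on which the feature has a nonzero activation. The page itself does not define the metric, so the sketch below only illustrates that assumed reading; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means (active tokens) / (all sampled tokens).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 3 active tokens out of 2000 sampled tokens.
sample = [0.0] * 1997 + [0.4, 1.2, 0.05]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.168% would correspond to roughly 17 active tokens per 10,000 sampled.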