Explanations
statements ending with a period and containing numbers
sentences that signify the conclusion of thoughts or statements
Negative Logits
volunte    -0.91
suspic     -0.89
gobl       -0.88
satell     -0.86
ŃĶ         -0.83
tremend    -0.81
abduct     -0.81
toget      -0.80
teasp      -0.78
affili     -0.78
Positive Logits
That            1.45
Instead         1.40
Similarly       1.40
Likewise        1.34
Nonetheless     1.32
Ironically      1.30
Nevertheless    1.30
Ultimately      1.28
Those           1.28
Consequently    1.27
Activations Density: 0.438%