INDEX
Explanations
endings of sentences
sentences or phrases that emphasize the conclusion or final point of a discussion
Negative Logits
ascus       -0.79
satisf      -0.74
thur        -0.73
deity       -0.73
messed      -0.73
derog       -0.72
bul         -0.71
fucked      -0.70
ishment     -0.68
maniac      -0.68
Positive Logits
Whereas         1.38
Firstly         1.25
According       1.21
Specifically    1.17
Though          1.16
While           1.15
Earlier         1.14
Particularly    1.14
Unlike          1.12
Consider        1.12
Activations Density: 0.514%