INDEX
Explanations
sentence endings or conclusions
punctuation marks at the end of sentences
Negative Logits
instinct     -0.79
glim         -0.78
dracon       -0.76
tremend      -0.75
dove         -0.73
glyph        -0.73
azo          -0.72
butterfly    -0.72
elbow        -0.71
comet        -0.70
Positive Logits
India            1.34
Apart            1.34
Sources          1.34
Besides          1.33
Interestingly    1.32
Besides          1.32
Talking          1.30
However          1.28
Moreover         1.27
According        1.26
Activations Density: 0.257%