INDEX
Explanations
punctuation marks at the end of sentences
sentences that conclude or summarize thoughts
Negative Logits
destro      -0.89
satell      -0.88
volunte     -0.88
affili      -0.87
commer      -0.82
proport     -0.81
encount     -0.79
enture      -0.77
strugg      -0.76
stocking    -0.74
Positive Logits
"'          1.61
Apparently  1.41
"…          1.38
Surely      1.37
Asked       1.37
Suddenly    1.30
Turns       1.27
"[          1.26
Nope        1.25
"           1.24
Activations Density 0.336%
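
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a number is commonly computed, assuming that density means the fraction of token positions on which the feature's activation is above zero. The function name activation_density, the threshold parameter, and the sample data are hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions where the feature fires.

    Assumption: "fires" means activation > threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 1,000 token positions, feature fires on 3 of them.
sample = [0.0] * 997 + [0.4, 1.2, 2.5]
print(f"{activation_density(sample):.3%}")  # prints 0.300%
```

Under that assumption, a density of 0.336% would correspond to the feature firing on roughly 1 in 300 tokens.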