Explanations
sentences that end with emphatic punctuation and suggest a strong statement or action
statements of personal experiences or feelings
Negative Logits
compr         -0.81
respectively  -0.73
inver         -0.73
arrang        -0.72
proport       -0.72
challeng      -0.71
charact       -0.71
uler          -0.71
necess        -0.70
inement       -0.70
Positive Logits
Literally     1.31
Lots          1.17
Absolutely    1.05
Period        1.04
Seriously     1.01
Probably      0.98
Twice         0.96
Kills         0.96
Didn          0.96
Everybody     0.96
Activations Density 0.535%