Explanations
sentences ending with a period
sentence-ending punctuation marks
Negative Logits
utter        -0.78
royalty      -0.74
luxurious    -0.73
badass       -0.71
darling      -0.71
readable     -0.70
sens         -0.70
sleeper      -0.70
slightest    -0.69
traitor      -0.69
Positive Logits
Ideally       1.07
Typically     1.04
Eventually    1.01
Instead       1.00
Ultimately    1.00
Because       0.99
Currently     0.99
Especially    0.98
Conversely    0.98
Luckily       0.97
Activations Density: 0.203%