Explanations
punctuation and quotes to signal dialogue or expression
Negative Logits
Token       Logit
indeed      -0.19
Indeed      -0.15
tomorrow    -0.15
tod         -0.15
today       -0.14
inde        -0.14
BELOW       -0.14
perhaps     -0.14
below       -0.14
PLEASE      -0.14
Positive Logits
Token       Logit
Growing     0.21
Growing     0.20
[           0.19
Working     0.19
Working     0.17
laughs      0.17
Plus        0.17
Me          0.17
Meeting     0.17
I           0.17
Activations Density: 0.095%