INDEX
Explanations
quotations within the text
instances of quotation marks in the text
Negative Logits
overlook     -0.75
lookout      -0.74
scheduled    -0.71
punct        -0.71
favor        -0.69
moder        -0.69
overshadow   -0.69
erupt        -0.68
slate        -0.68
termin       -0.68
Positive Logits
We           1.24
They         1.23
Our          1.21
Sometimes    1.13
Where        1.13
Because      1.12
There        1.11
It           1.10
Russ         1.09
I            1.08
Activations Density: 0.083%