Explanations
quotations within a text
instances of direct quotations or speech in the text
Negative Logits
scheduled       -0.83
cannabin        -0.82
adjud           -0.81
favor           -0.77
schedule        -0.73
graded          -0.72
resettlement    -0.71
surfaced        -0.71
midterm         -0.70
distingu        -0.70
Positive Logits
We              1.15
I               1.10
Our             1.05
Hey             1.02
heter           1.00
Dear            0.99
Everyone        0.99
Absolutely      0.97
Walk            0.96
It              0.96
Activations Density: 0.081%