INDEX
Explanations
short phrases beginning or ending with quotation marks
occurrences of quotation marks
Negative Logits
favor       -0.78
bunk        -0.71
footing     -0.69
veget       -0.68
mids        -0.68
virgin      -0.68
ornament    -0.68
UX          -0.66
honor       -0.65
slam        -0.65
Positive Logits
Therefore    1.19
Whereas      1.16
There        1.14
It           1.11
We           1.09
Whoever      1.09
Moreover     1.09
Fortunately  1.09
They         1.08
Clearly      1.07
Activations Density: 0.084%