INDEX
Explanations
quotation marks
quotable phrases or statements
Negative Logits
favor        -0.76
bunk         -0.71
honor        -0.71
nude         -0.69
clo          -0.69
livestream   -0.69
footing      -0.68
nested       -0.67
encl         -0.67
inadequ      -0.66
Positive Logits
Therefore    1.28
Whereas      1.27
It           1.25
There        1.23
They         1.22
If           1.21
However      1.21
Unless       1.20
We           1.18
Fortunately  1.17
Activations Density 0.092%