INDEX
Explanations
quotes within text
quotation marks and dialogue
Negative Logits
Token       Logit
bunk        -0.76
favor       -0.75
honor       -0.71
shack       -0.66
veget       -0.65
slam        -0.64
footing     -0.64
monog       -0.64
fab         -0.63
annually    -0.63
Positive Logits
Token           Logit
Therefore       1.09
Whereas         1.05
Fortunately     1.02
Moreover        1.01
There           0.98
Certainly       0.97
However         0.96
We              0.95
Nevertheless    0.95
Whoever         0.94
Activations Density 0.068%
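
The density figure can be read as the share of token positions on which the feature fires. The sketch below is an illustration only: it assumes density means "fraction of positions with activation above a threshold", and the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical rather than taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" here means the share of positions where the
    feature is active; the dashboard's exact definition is not given.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 7 of which are active.
toy = [0.0] * 10_000
for position in (3, 50, 512, 1024, 4096, 7777, 9999):
    toy[position] = 1.5

density = activation_density(toy)
print(f"{density:.3%}")  # formatted as a percentage, like the dashboard value
```

A density well below 1%, as reported here, would mean the feature is sparse: it is inactive on the vast majority of tokens.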