Explanations
quotations
quotations or cited statements within the text
Negative Logits
Token       Logit
reel        -0.81
accomp      -0.76
Versus      -0.73
fray        -0.70
roar        -0.69
square      -0.69
draw        -0.68
rolled      -0.67
metic       -0.67
guide       -0.66
Positive Logits
Token       Logit
there       2.10
they        1.93
these       1.83
everyone    1.80
we          1.76
nob         1.73
this        1.72
it          1.71
many        1.70
everything  1.69
Activations Density 0.273%