Explanations
instances of the word "in" and its variations within the text
Negative Logits
venge     -0.18
oretical  -0.16
arium     -0.16
spite     -0.15
oret      -0.15
soever    -0.15
kud       -0.15
FTER      -0.15
orem      -0.15
theless   -0.15
Positive Logits
order     0.21
hopes     0.20
aus       0.20
shore     0.18
ital      0.16
cre       0.16
areas     0.16
iqu       0.16
ns        0.16
trans     0.16
Activations Density 1.307%