INDEX
Explanations
instances of the word "come" and its variations
Negative Logits
ikip      -0.17
foy       -0.16
ubi       -0.16
rit       -0.15
tright    -0.15
lops      -0.14
bih       -0.14
rible     -0.14
eldorf    -0.14
jure      -0.14
Positive Logits
upp       0.23
forward   0.22
into      0.22
plete     0.21
backs     0.21
leon      0.19
forth     0.19
pletely   0.19
to        0.18
back      0.17
Activations Density 0.056%