INDEX
Explanations
instances of the word "Miracle" and its variants
Negative Logits
owi      -0.16
extr     -0.16
illet    -0.15
edu      -0.15
emez     -0.15
MIC      -0.15
onso     -0.15
lut      -0.15
emann    -0.15
e        -0.14
Positive Logits
roring   0.33
iam      0.29
rored    0.28
rors     0.27
acles    0.26
iams     0.25
acle     0.24
rror     0.22
ACLE     0.22
abile    0.22
Activations Density: 0.007%