INDEX
Explanations
instances of the word "doing."
repetitive mentions of the word "doing"
Negative Logits
lights   -0.71
liner    -0.66
Tier     -0.66
mares    -0.66
)=(      -0.65
uru      -0.64
wit      -0.63
sg       -0.62
ulates   -0.61
iewicz   -0.61
Positive Logits
nothing    0.80
pez        0.79
omething   0.79
oms        0.78
brisk      0.77
omsday     0.75
ggy        0.74
女         0.74
something  0.73
ppel       0.73
Activations Density 0.048%