Explanations
occurrences of the word "walk"
Negative Logits
Token           Logit
encies          -0.87
nces            -0.72
iling           -0.69
circumstance    -0.68
afort           -0.66
iled            -0.65
pressures       -0.65
melt            -0.64
anyahu          -0.64
illian          -0.64
Positive Logits
Token           Logit
through         1.03
about           0.98
own             0.92
ways            0.92
upright         0.90
bow             0.89
abouts          0.87
itzer           0.84
bows            0.82
way             0.81
Activations Density: 0.487%