INDEX
Explanations
instances of the word "walk" and its various forms
Negative Logits
itt      -0.15
Äı       -0.15
lc       -0.15
uels     -0.15
ijke     -0.14
illet    -0.14
_DL      -0.14
inges    -0.14
edi      -0.14
luv      -0.14
Positive Logits
walk     0.32
walk     0.29
Walk     0.29
Walk     0.27
walks    0.26
walked   0.26
.walk    0.24
_walk    0.23
away     0.23
æŃ©      0.22
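
Some tokens in the lists above (for example "Äı" and "æŃ©") look like byte-level BPE display strings rather than ordinary text. The sketch below assumes, without confirmation from this page, that the tokenizer uses the GPT-2 style byte-to-unicode table; under that assumption each displayed character stands for one raw byte, and the bytes can be decoded as UTF-8. The function names are illustrative only.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table (assumed tokenizer)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: displayed character -> raw byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Turn a displayed token string back into text (hypothetical helper)."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in display)
    # Tokens can end mid-character, so replace incomplete sequences.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["walk", "\u00c4\u0131", "\u00e6\u0143\u00a9"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, "æŃ©" corresponds to the bytes E6 AD A9, which is the UTF-8 encoding of 歩 (U+6B69), a character meaning "walk". That would be consistent with the explanation given for this feature. Plain ASCII tokens such as "walk" pass through unchanged.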
Activations Density 0.024%
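
As a rough illustration of the density figure, the sketch below assumes it is the percentage of sampled tokens on which the feature has a nonzero activation; the page itself does not define it. The function name and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Percentage of tokens whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


if __name__ == "__main__":
    # Hypothetical activations for four tokens; only one is nonzero.
    sample = [0.0, 0.0, 0.5, 0.0]
    print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.024% means the feature fires on roughly one token in about 4,200.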