Explanations
the word "step" followed by a numeric value (e.g., step 9)
Negative Logits
yip
-0.71
selage
-0.66
覚醒
-0.66
ores
-0.65
oros
-0.64
ciating
-0.64
ecause
-0.64
raid
-0.63
ドラゴン
-0.62
Moroc
-0.62
Positive Logits
aside
0.94
forth
0.94
frog
0.91
ashore
0.91
up
0.91
up
0.84
forward
0.84
toe
0.82
foot
0.82
onstage
0.81
Activations Density 0.022%