Explanations
verbs related to physical actions or expressions
actions or activities that denote engagement or involvement
Negative Logits
Token         Logit
dt            -0.68
zip           -0.68
can           -0.66
Kinnikuman    -0.64
so            -0.61
heim          -0.59
lat           -0.59
nton          -0.59
lo            -0.59
net           -0.58
Positive Logits
Token         Logit
oaded         0.75
GGGGGGGG      0.74
ADRA          0.71
pige          0.70
redients      0.70
ependent      0.70
ipers         0.68
ivable        0.65
seless        0.65
edge          0.64
Activations Density 0.383%
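
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number is commonly computed, assuming density means the fraction of token positions on which the feature's activation is above zero. The page itself does not state its definition, so the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" is the share of token positions where the
    feature fires at all. The dashboard's exact definition is not given.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 4 of them with non-zero activation.
sample = [0.0] * 996 + [1.2, 0.3, 2.5, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.400%"
```

Under this reading, a density of 0.383% would mean the feature is active on roughly 4 of every 1,000 token positions in the sampled text.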