Explanations
words related to sudden or intense actions
words related to physical interactions or actions
Negative Logits
resso      -0.68
lander     -0.66
elt        -0.65
omet       -0.65
lar        -0.62
alt        -0.60
astical    -0.58
opens      -0.58
erest      -0.58
ruby       -0.58
Positive Logits
kefeller       0.77
ROR            0.74
SPONSORED      0.70
doors          0.69
@@@@@@@@       0.65
VICE           0.64
Dragonbound    0.63
Warfare        0.62
liness         0.61
xual           0.61
Activations Density 0.103%