INDEX
Explanations
words related to stealth or secrecy
words related to stealthy actions and hidden movements
Negative Logits
Token     Logit
rex       -0.82
Anger     -0.79
respond   -0.77
Respond   -0.74
Scale     -0.72
Resp      -0.71
RESP      -0.67
iov       -0.66
tyres     -0.64
ourses    -0.64
Positive Logits
Token       Logit
sneaking    2.12
lurking     2.08
sneak       2.03
tucked      1.81
slipped     1.74
creeping    1.72
unnoticed   1.71
crept       1.69
sneaky      1.68
slipping    1.64
Activations Density: 0.031%