INDEX
Explanations
references to handshakes and head shakes
Negative Logits
ινÏĮ     -0.16
alon     -0.15
'gc      -0.15
riger    -0.15
lice     -0.15
žel      -0.15
TORT     -0.15
lix      -0.15
rent     -0.14
otp      -0.14
Positive Logits
shake    0.48
shaking  0.44
Shake    0.43
shakes   0.41
shook    0.40
shake    0.38
shaken   0.32
peare    0.24
Shak     0.22
Shack    0.22
Activations Density 0.023%