Explanations
references to objects or concepts associated with hands
Negative Logits
Token       Logit
disappoint  -0.68
disapp      -0.67
enigmatic   -0.60
onymous     -0.59
derog       -0.58
reciproc    -0.58
accomp      -0.56
rs          -0.56
abby        -0.53
deletion    -0.52
Positive Logits
Token    Logit
eye      0.80
held     0.75
illin    0.71
gren     0.70
heed     0.69
written  0.68
gang     0.67
cart     0.67
Cam      0.65
rate     0.63
Activations Density: 0.020%