INDEX
Explanations
objects and actions related to manual tasks and tools
Negative Logits
Token      Logit
knife      -0.18
knives     -0.17
needle     -0.17
needle     -0.16
scissors   -0.16
needles    -0.15
刀         -0.15
Knife      -0.15
maze       -0.15
DeepCopy   -0.14
Positive Logits
Token      Logit
hammer     0.40
Hammer     0.34
hammer     0.30
Ham        0.26
club       0.26
.ham       0.26
clubs      0.25
ammers     0.24
Club       0.24
club       0.22
Activations Density 0.067%
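
The density figure above is most naturally read as the fraction of sampled tokens on which the feature fires. The source does not say how it was computed, so the following is only a minimal sketch under that assumption: the function name `activation_density`, the zero threshold, and the synthetic `toy` array are all hypothetical and chosen for illustration, not taken from the dashboard.

```python
import numpy as np


def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active. The source does not confirm this definition.
    """
    acts = np.asarray(activations, dtype=float)
    if acts.size == 0:
        raise ValueError("activations must contain at least one value")
    return np.count_nonzero(acts > threshold) / acts.size


# Synthetic data (not from the source): 67 active tokens out of 100,000,
# constructed so that the result matches the reported figure.
toy = np.zeros(100_000)
toy[:67] = 1.5

print(f"{activation_density(toy):.3%}")  # 0.067%
```

Under this reading, a density of 0.067% means the feature is active on roughly 1 token in 1,500, which is consistent with a narrow feature tied to a small set of tool-related tokens.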