INDEX
Explanations
references to dog harnesses and pet ownership
Negative Logits
hton           -0.15
jez            -0.15
BJECT          -0.15
lay            -0.15
ickey          -0.15
utter          -0.14
MouseButton    -0.14
lage           -0.14
pak            -0.14
okin           -0.14
Positive Logits
collar         0.27
leash          0.25
-collar        0.24
coll           0.23
Coll           0.23
coll           0.20
Coll           0.19
le             0.19
unleashed      0.19
recall         0.19
Activations Density: 0.028%