INDEX
Explanations
references to shoes
Negative Logits
Token         Logit
CLASSIFIED    -0.85
Modes         -0.73
REDACTED      -0.73
Debor         -0.70
imens         -0.68
Downloadha    -0.67
ADRA          -0.67
MacArthur     -0.66
encia         -0.65
mingham       -0.64
Positive Logits
Token         Logit
toe           1.01
horn          0.97
shoe          0.92
heel          0.85
oried         0.80
strap         0.79
bag           0.77
hoff          0.76
cart          0.74
bag           0.73
Activations Density: 0.018%