Explanations
No Explanations Found
Negative Logits
closet          -0.81
gaard           -0.69
boxer           -0.67
detox           -0.67
chemotherapy    -0.66
surfing         -0.65
locker          -0.64
pregnancy       -0.63
cycling         -0.63
furnace         -0.62
Positive Logits
hens            0.83
rots            0.74
omen            0.72
iris            0.69
itual           0.68
ASC             0.67
Fine            0.66
ensional        0.65
yright          0.65
Fancy           0.65
Activations Density: 0.000%
No Known Activations
This feature has no known activations.