Explanations
No Explanations Found
Negative Logits
emia         -0.70
arta         -0.65
Mushroom     -0.64
IU           -0.64
abet         -0.63
rha          -0.62
Hipp         -0.62
ocus         -0.61
Peach        -0.60
heartbeat    -0.60
Positive Logits
drops        0.83
yip          0.74
packs        0.74
BUG          0.69
icides       0.68
handler      0.65
ignt         0.65
intervene    0.64
NG           0.64
thens        0.64
Activations Density: 0.000%
This feature has no known activations.