Explanations
No Explanations Found
Negative Logits
Token         Logit
TRAN          -0.69
ulhu          -0.69
msec          -0.67
casualty      -0.65
circulation   -0.63
CBI           -0.63
uddin         -0.63
casualties    -0.61
abus          -0.60
1001          -0.59
Positive Logits
Token         Logit
skirts        0.81
essen         0.76
sters         0.68
ward          0.67
encing        0.67
Kitt          0.65
eus           0.64
fort          0.63
enn           0.62
eared         0.62
Activations Density: 0.000%
No Known Activations
This feature has no known activations.