Explanations
No Explanations Found
Negative Logits
Token       Logit
Borough     -0.78
Vessel      -0.71
sided       -0.66
Planes      -0.64
stadiums    -0.62
fined       -0.62
Sold        -0.62
Prin        -0.62
Taxi        -0.61
parach      -0.59
Positive Logits
Token       Logit
awar        0.73
SPONSORED   0.73
law         0.73
typ         0.72
papers      0.69
ACTION      0.69
mit         0.69
AppData     0.68
keleton     0.68
ãĤ£         0.68
Activations Density: 0.000%
No Known Activations
This feature has no known activations.