INDEX
Explanations
No Explanations Found
Negative Logits
+)        -0.67
Salary    -0.61
Decision  -0.60
Bohem     -0.60
inav      -0.59
contrace  -0.59
Working   -0.59
Sabb      -0.58
UE        -0.57
enstein   -0.56
Positive Logits
ota       0.84
cific     0.76
unks      0.75
itzer     0.75
OTE       0.72
isations  0.68
achu      0.68
jriwal    0.67
oting     0.67
oters     0.66
Activations Density 0.000%
No Known Activations
This feature has no known activations.