Explanations
No Explanations Found
Negative Logits
afort      -0.76
asus       -0.75
folios     -0.74
athan      -0.74
raid       -0.70
otyp       -0.70
agonist    -0.70
agen       -0.67
QB         -0.67
sync       -0.67
Positive Logits
stocking        0.76
pupils          0.71
Zub             0.67
commenting      0.65
foreigners      0.63
repeating       0.62
retaining       0.62
admitting       0.62
deported        0.61
discriminated   0.61
Activations Density 0.000%
No Known Activations
This feature has no known activations.