Explanations
No Explanations Found
Negative Logits
liner       -0.76
pecially    -0.74
0010        -0.71
contamin    -0.65
retty       -0.65
certific    -0.65
monop       -0.63
ishly       -0.63
orously     -0.62
param       -0.60
Positive Logits
Fathers      0.74
ador         0.73
utral        0.73
tics         0.72
ãĥ¼ãĥ³       0.72
Deliver      0.67
Workers      0.67
xon          0.66
zek          0.66
Employees    0.66
Activations Density: 0.000%
No Known Activations
This feature has no known activations.