Explanations
No Explanations Found
Negative Logits
Token    Logit
Eva      -0.93
eca      -0.86
kson     -0.85
gress    -0.82
yg       -0.80
Roz      -0.79
Sasha    -0.78
seiz     -0.76
Roose    -0.75
ovych    -0.74
Positive Logits
Token         Logit
ional         0.75
liner         0.73
managers      0.70
manager       0.68
oke           0.66
leverage      0.66
matic         0.65
Engineers     0.64
management    0.63
powers        0.62
Activations Density: 0.000%
No Known Activations
This feature has no known activations.