Explanations
No Explanations Found
Negative Logits
PIN                  -0.81
spect                -0.78
Factor               -0.75
externalActionCode   -0.75
Ring                 -0.75
Narr                 -0.72
membr                -0.70
WT                   -0.70
Phill                -0.69
Redd                 -0.69
Positive Logits
ledge                0.68
adam                 0.66
Berkeley             0.66
Neo                  0.65
ilo                  0.64
eve                  0.63
Bunker               0.63
andowski             0.61
hips                 0.61
Alm                  0.61
Activations Density: 0.000%
No Known Activations
This feature has no known activations.