INDEX
Explanations
No Explanations Found
Negative Logits
elin      -0.90
sels      -0.81
itars     -0.74
atin      -0.73
zanne     -0.71
â̲         -0.71
egal      -0.69
anders    -0.69
elsen     -0.67
ilty      -0.66
Positive Logits
Carney    0.74
BaseType  0.72
Bowen     0.67
Piper     0.66
Ambro     0.66
Yar       0.65
Shore     0.65
hoff      0.64
Compan    0.63
Brist     0.63
Activations Density: 0.000%
No Known Activations
This feature has no known activations.