Explanations
No Explanations Found
Negative Logits
kees        -0.78
CLASSIFIED  -0.74
fair        -0.73
KS          -0.69
sup         -0.65
angs        -0.65
ives        -0.64
inqu        -0.63
adr         -0.63
fred        -0.62
Positive Logits
inburgh     0.79
performer   0.69
ourney      0.69
NCT         0.68
stained     0.66
rison       0.65
rehearsal   0.63
plumbing    0.63
Op          0.62
hyde        0.62
Activations Density: 0.000%
No Known Activations
This feature has no known activations.