INDEX
Explanations
No Explanations Found
Negative Logits
artisan     -0.72
mand        -0.72
esters      -0.65
otyp        -0.64
orally      -0.63
owsky       -0.61
osate       -0.60
itudinal    -0.60
enei        -0.60
Written     -0.59
Positive Logits
ITNESS      0.69
Bauer       0.66
here        0.66
ité         0.66
isons       0.66
illac       0.65
iliary      0.64
illon       0.64
Marqu       0.63
ÃŁ          0.61
Activations Density 0.000%
No Known Activations
This feature has no known activations.