INDEX
Explanations
No Explanations Found
Negative Logits
Token       Logit
izophren    -0.81
kees        -0.76
chedel      -0.74
Meditation  -0.73
netflix     -0.70
ouf         -0.69
pac         -0.68
fat         -0.67
onite       -0.67
Awakens     -0.67
Positive Logits
Token       Logit
Nanto       0.76
Suz         0.70
Cly         0.69
Cerberus    0.64
Becky       0.64
Seraph      0.63
Diane       0.62
Leopard     0.62
Leigh       0.62
Meet        0.62
Activations Density 0.000%
No Known Activations
This feature has no known activations.