Explanations
No Explanations Found
Negative Logits
wrong       -0.75
Tsukuyomi   -0.74
blackout    -0.72
Nin         -0.71
Bunker      -0.66
1850        -0.66
sunset      -0.64
Seah        -0.64
incorrect   -0.63
surpr       -0.62
Positive Logits
acco        0.76
phia        0.72
zag         0.72
mbuds       0.70
oug         0.68
ruit        0.66
itsch       0.66
geon        0.64
azar        0.64
uce         0.64
Activations Density: 0.000%
No Known Activations
This feature has no known activations.