Explanations
No Explanations Found
Negative Logits
Token        Logit
andel        -0.80
lation       -0.80
activation   -0.78
perse        -0.76
ppe          -0.76
essen        -0.74
je           -0.73
omorph       -0.71
angled       -0.70
schild       -0.70
Positive Logits
Token        Logit
removable    0.84
Vanity       0.69
NX           0.68
Torch        0.67
Outs         0.66
Vul          0.64
Umb          0.64
Characters   0.64
Barrett      0.63
playable     0.61
Activations Density: 0.000%
No Known Activations
This feature has no known activations.