Explanations
No Explanations Found
Negative Logits
Token           Logit
erion           -0.82
ether           -0.79
intend          -0.69
egu             -0.68
GoldMagikarp    -0.65
course          -0.65
amphetamine     -0.65
iction          -0.65
adoption        -0.64
hetamine        -0.64
Positive Logits
Token           Logit
Skies           0.70
Roz             0.70
Quote           0.69
Tyrann          0.67
Democr          0.67
hawk            0.64
Laf             0.63
Hue             0.61
attRot          0.61
Maz             0.61
Activations Density: 0.000%
No Known Activations
This feature has no known activations.