Explanations
No Explanations Found
Negative Logits
Token       Logit
SHAR        -0.66
thought     -0.65
avorite     -0.65
plates      -0.65
strength    -0.64
FANT        -0.63
plate       -0.62
shadow      -0.61
Storm       -0.60
Ballistic   -0.60
Positive Logits
Token        Logit
arity        0.73
opian        0.71
Cosponsors   0.67
Fiat         0.65
ogle         0.62
environment  0.61
ulz          0.61
ually        0.61
intendo      0.59
ido          0.58
Activations Density 0.000%
This feature has no known activations.