Explanations
No Explanations Found
Negative Logits
Token     Logit
perspect  -0.74
racks     -0.73
Afee      -0.72
Canaver   -0.71
Akin      -0.67
nets      -0.62
stakes    -0.60
dismant   -0.59
awa       -0.59
sorts     -0.58
Positive Logits
Token     Logit
itudinal  0.81
vid       0.76
gin       0.72
pid       0.72
gary      0.68
appa      0.68
ibrary    0.66
gex       0.66
algia     0.64
hran      0.63
Activations Density: 0.000%
No Known Activations
This feature has no known activations.