INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.09
1:0.06
2:0.09
3:0.08
4:0.07
5:0.07
6:0.08
7:0.07
8:0.09
9:0.08
10:0.08
11:0.08
Negative Logits
orns        -1.68
Prin        -1.64
Sponsor     -1.63
Tags        -1.62
recipient   -1.60
azo         -1.58
aron        -1.57
paren       -1.57
ipples      -1.55
uer         -1.55
Positive Logits
Hitchcock   1.75
kaya        1.71
?'"         1.70
rium        1.62
icz         1.61
yi          1.60
rio         1.58
Talent      1.57
acted       1.55
posterior   1.54
Activations Density 0.000%
This feature has no known activations.