INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.09
1:0.06
2:0.09
3:0.08
4:0.08
5:0.08
6:0.08
7:0.08
8:0.08
9:0.07
10:0.07
11:0.09
Negative Logits
pictured    -1.71
dashed      -1.69
ital        -1.63
circled     -1.59
chid        -1.57
boxed       -1.56
boo         -1.49
Cele        -1.49
trailed     -1.47
HH          -1.46
Positive Logits
��          1.75
fundament   1.69
ゴン        1.68
Quit        1.66
arrang      1.63
Enabled     1.61
CONTR       1.58
gments      1.58
sugg        1.56
edom        1.56
Activations Density 0.000%
This feature has no known activations.