INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.09
1:0.05
2:0.08
3:0.08
4:0.10
5:0.10
6:0.07
7:0.07
8:0.07
9:0.07
10:0.08
11:0.07
Negative Logits
comr          -1.71
biscuits      -1.49
antioxid      -1.48
¯             -1.46
bribes        -1.45
monarch       -1.42
*)            -1.42
compliments   -1.41
phia          -1.41
unden         -1.41
Positive Logits
yg            1.66
reshold       1.65
rb            1.61
ua            1.59
gger          1.59
iannopoulos   1.56
urst          1.55
scan          1.51
attr          1.50
bg            1.48
Activations Density 0.000%
This feature has no known activations.