INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.08
2:0.09
3:0.07
4:0.10
5:0.09
6:0.09
7:0.08
8:0.06
9:0.07
10:0.07
11:0.08
Negative Logits
loss        -3.10
idepress    -3.10
doms        -3.02
anza        -3.01
netflix     -2.91
Moff        -2.86
edom        -2.84
Fail        -2.79
Poe         -2.78
Kill        -2.77
Positive Logits
ingred                            3.34
Catalog                           3.07
Creation                          2.97
********************************  2.92
paperwork                         2.82
bou                               2.76
conserv                           2.71
loader                            2.68
Construction                      2.67
Construction                      2.64
Activations Density 0.000%
This feature has no known activations.