INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.06
1:0.06
2:0.08
3:0.07
4:0.09
5:0.07
6:0.10
7:0.08
8:0.08
9:0.07
10:0.10
11:0.08
Negative Logits
acea: -1.82
agn: -1.76
changing: -1.73
ヤ: -1.69
aces: -1.65
changes: -1.65
ageddon: -1.64
filled: -1.60
angered: -1.57
changed: -1.55
Positive Logits
Fn: 1.58
)))): 1.54
supervised: 1.51
WATCHED: 1.51
playthrough: 1.49
surv: 1.48
operator: 1.46
unit: 1.42
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯: 1.41
airflow: 1.41
Activations Density 0.000%
No Known Activations
This feature has no known activations.