Explanations
phrases related to advertisements or viewer engagement
Head Attr Weights
0:0.12
1:0.05
2:0.07
3:0.08
4:0.05
5:0.11
6:0.08
7:0.03
8:0.09
9:0.12
10:0.10
11:0.05
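The head:weight pairs above can be read programmatically. The following is a minimal sketch, not the dashboard's own tooling: `parse_head_weights` is a hypothetical helper name, and the only input is the twelve values listed above.

```python
# Head attribution weights copied verbatim from the "head:weight" lines above.
RAW_WEIGHTS = """0:0.12
1:0.05
2:0.07
3:0.08
4:0.05
5:0.11
6:0.08
7:0.03
8:0.09
9:0.12
10:0.10
11:0.05"""


def parse_head_weights(text):
    """Hypothetical helper: map each head index to its attribution weight."""
    weights = {}
    for line in text.splitlines():
        head, value = line.split(":")
        weights[int(head)] = float(value)
    return weights


weights = parse_head_weights(RAW_WEIGHTS)

# Rank heads from largest to smallest weight; ties keep their listed order.
ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
total = sum(weights.values())

for head, weight in ranked[:3]:
    print(f"head {head}: {weight:.2f}")
print(f"sum of listed weights: {total:.2f}")
```

Heads 0 and 9 share the largest listed weight (0.12) and head 7 has the smallest (0.03). The listed values sum to 0.95 rather than 1.00; whether the remainder is rounding or mass assigned elsewhere is not stated on the page.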
Negative Logits
rower         -1.46
mechanically  -1.20
finally       -1.20
neglig        -1.20
contracted    -1.13
contracting   -1.12
settled       -1.12
fortunate     -1.11
hereby        -1.08
gene          -1.07
Positive Logits
adders        1.47
rooft         1.31
interstitial  1.28
Iv            1.27
VIDEOS        1.22
裏�           1.20
cuts          1.18
smart         1.17
SPONSORED     1.16
�             1.16
Activations Density 0.002%
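To put the density figure in more concrete terms, the short sketch below converts the percentage to a per-token rate. It assumes that "Activations Density" means the share of tokens on which the feature fires at all, which the page itself does not define.

```python
# Assumption: density is the percentage of tokens with a nonzero activation.
density_percent = 0.002

# Convert the percentage to a plain fraction of tokens.
fraction = density_percent / 100

# Approximate number of tokens seen per activation, rounded to an integer.
tokens_per_activation = round(1 / fraction)

print(f"fraction of tokens: {fraction:.6f}")
print(f"about 1 activation per {tokens_per_activation} tokens")
```

Under that assumption the feature is very sparse, firing on roughly one token in 50,000.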