INDEX
Explanations
references to media ratings and viewer engagement
Negative Logits
usz       -0.16
ÃŃl       -0.15
los       -0.15
еÑģÑĮ      -0.15
Nin       -0.14
neither   -0.14
plus      -0.14
ubar      -0.14
´Ī        -0.13
olo       -0.13
Positive Logits
apart     0.31
Apart     0.30
Apart     0.29
majority  0.26
Majority  0.23
Aside     0.21
aside     0.21
Aside     0.18
aside     0.17
éϤ        0.17
Activations Density: 0.032%