INDEX
Explanations
references to audience engagement and entertainment
Negative Logits
rement    -0.17
978       -0.15
umper     -0.14
iek       -0.14
ÑĨей      -0.14
åģ        -0.14
tane      -0.14
æĮ        -0.14
828       -0.14
848       -0.14
Positive Logits
Ñģвоим     0.19
les        0.18
capt       0.16
Ñģвоими    0.15
eah        0.15
posit      0.14
eyi        0.14
Ļ          0.14
imb        0.14
headlines  0.14
Activations Density: 0.127%