Explanations
references to visual media and the act of watching films or shows
Negative Logits
zell          -0.17
legg          -0.15
Visibility    -0.15
lector        -0.15
anonymity     -0.14
UPI           -0.14
/Public       -0.14
Ãło           -0.14
ró            -0.14
quires        -0.14
Positive Logits
cover         0.22
aloud         0.21
relig         0.19
unfold        0.19
religious     0.19
forwards      0.18
backwards     0.18
backward      0.18
/watch        0.18
attent        0.18
Activations Density: 0.152%