INDEX
Explanations
references to specific films and their characteristics
Negative Logits
ç¤    -0.15
cak    -0.14
TEGER    -0.14
à¸ĩาà¸Ļ    -0.14
stripslashes    -0.14
.lst    -0.14
posables    -0.14
lineman    -0.14
dae    -0.14
rint    -0.13
Positive Logits
thr    0.27
western    0.27
thriller    0.24
thrill    0.23
Thr    0.23
Thr    0.22
soap    0.20
western    0.20
serial    0.20
thr    0.20
Activations Density 0.070%