Explanations
references to popular television shows and their cultural implications
Negative Logits
Token           Logit
ervo            -0.16
ivi             -0.16
едак            -0.15
.codec          -0.14
vod             -0.14
erli            -0.14
.Transactional  -0.14
afort           -0.13
lemn            -0.13
agrant          -0.13
Positive Logits
Token           Logit
sitcom          0.32
Rose            0.28
Sit             0.24
Rose            0.23
ABC             0.22
comedy          0.22
sit             0.22
Sit             0.21
Comedy          0.20
ABC             0.20
Activations Density: 0.064%