INDEX
Explanations
references to television shows and their ratings
Negative Logits
mai          -0.17
adia         -0.17
NetMessage   -0.16
ondheim      -0.15
δÏģο         -0.15
orge         -0.15
uktur        -0.15
warts        -0.15
awan         -0.14
idlo         -0.14
Positive Logits
Jose         0.17
alg          0.17
Je           0.16
Moran        0.16
Jose         0.15
Tube         0.15
Gos          0.15
Wol          0.15
Je           0.15
ÌĨ           0.14
Activations Density 0.040%