Explanations
references to specific animated television series and their related content
Negative Logits
Token       Logit
ova         -0.15
umas        -0.15
erez        -0.15
908         -0.15
overhead    -0.14
ippy        -0.14
808         -0.14
lg          -0.14
817         -0.14
ところ      -0.14
Positive Logits
Token         Logit
Simpsons      0.32
Simpson       0.31
Springfield   0.30
Homer         0.28
simp          0.26
Bart          0.24
simp          0.21
Couch         0.21
(simp         0.20
couch         0.19
Activations Density: 0.043%