Explanations
mentions of film or entertainment franchises and their characteristics
Negative Logits
enden     -0.15
gren      -0.15
-elect    -0.14
heets     -0.14
ths       -0.14
613       -0.14
err       -0.13
reeze     -0.13
394       -0.13
oten      -0.13
Positive Logits
illo           0.16
dil            0.15
adí            0.14
uttle          0.14
осуд           0.14
СР             0.14
æĮ             0.14
audi           0.14
ActionTypes    0.13
par            0.13
Activations Density 0.040%