Explanations
genres of films and shows
Negative Logits
Token         Logit
spos          -0.15
xmm           -0.15
«a            -0.14
novels        -0.14
StackTrace    -0.14
CPP           -0.13
APH           -0.13
summ          -0.13
ALE           -0.13
destin        -0.13
Positive Logits
Token         Logit
Romance       0.21
Action        0.18
Action        0.18
Sci           0.17
HDC           0.17
964           0.17
Fant          0.17
antasy        0.17
Fantasy       0.16
-action       0.16
Activations Density 0.013%