INDEX
Explanations
references to films and their attributes
Negative Logits
grace    -0.18
ipp      -0.16
Grace    -0.15
Boeh     -0.15
395      -0.14
Dro      -0.14
Bros     -0.14
hta      -0.14
         -0.14
hton     -0.14
Positive Logits
contres  0.20
DSL      0.15
jal      0.15
ARAM     0.15
vise     0.14
ioc      0.14
ninete   0.14
ALAR     0.14
вк       0.14
PTY      0.14
Activations Density 0.027%