Explanations
references to films and their elements such as characters, settings, and themes
Negative Logits
emen      -0.15
erta      -0.15
earing    -0.15
ippers    -0.14
ur        -0.14
eddar     -0.14
eg        -0.14
Johan     -0.14
spl       -0.14
abar      -0.13
Positive Logits
ukkit     0.16
indre     0.16
Nun       0.16
_REUSE    0.16
hay       0.15
лач       0.15
λώ        0.15
ác        0.15
Grill     0.14
ุป        0.14
Activations Density 0.234%