INDEX
Explanations
critiques of film narratives and their execution
Negative Logits
hek           -0.15
doma          -0.13
imus          -0.13
üstü          -0.13
ecom          -0.13
_iterations   -0.13
heard         -0.12
ecd           -0.12
ä½            -0.12
ãĥ¼ãĥĸãĥ«     -0.12
Positive Logits
deals         0.20
stars         0.19
centers       0.18
concerned     0.18
traff         0.18
features      0.18
pos           0.17
concerns      0.17
starred       0.17
hinges        0.17
Activations Density 0.125%
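
Two entries in the negative logits list (ä½ and ãĥ¼ãĥĸãĥ«) look like raw byte-level BPE token strings rather than readable text. The sketch below shows one way such strings could be turned back into text, assuming the underlying tokenizer uses the GPT-2-style byte-to-character table; the dashboard does not state which tokenizer it uses, so that is an assumption, and the helper names `byte_to_char_table` and `decode_token` are illustrative only.

```python
def byte_to_char_table():
    """Build the byte -> printable-character map used by GPT-2-style byte-level BPE.

    Assumption: the tokens shown in the dashboard were produced with this scheme.
    """
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(0x21, 0x7F))      # '!' .. '~'
        + list(range(0xA1, 0xAD))    # '¡' .. '¬'
        + list(range(0xAE, 0x100))   # '®' .. 'ÿ'
    )
    printable_set = set(printable)
    table = {b: chr(b) for b in printable}
    # Every remaining byte is shifted to code points starting at U+0100.
    offset = 0
    for b in range(256):
        if b not in printable_set:
            table[b] = chr(256 + offset)
            offset += 1
    return table


# Inverse map: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in byte_to_char_table().items()}


def decode_token(token: str) -> str:
    """Map each displayed character back to its byte, then decode as UTF-8.

    Incomplete byte sequences are rendered as U+FFFD instead of raising.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("ãĥ¼ãĥĸãĥ«"))  # ーブル
    print(decode_token("ä½"))          # partial UTF-8 sequence, not a full character
```

Under that assumption, ãĥ¼ãĥĸãĥ« corresponds to the katakana fragment ーブル, while ä½ is only the first two bytes of a three-byte character and cannot be decoded on its own. The helper applies only to entries still in byte-level form; entries that already display as readable text, such as üstü, should not be passed through it.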