Explanations
references to film directors and their works
Negative Logits
_triggered   -0.16
coni         -0.15
omen         -0.15
asca         -0.15
itori        -0.15
nodoc        -0.14
uese         -0.14
иму          -0.14
lesen        -0.14
::=          -0.14
Positive Logits
Atom         0.24
Spike        0.21
Harmony      0.21
Atom         0.21
đạo          0.19
Ridley       0.19
liğini       0.18
dir          0.18
Baz          0.18
息           0.17
Activations Density 0.070%