INDEX
    Explanations

    references to film directors and their works

    Negative Logits
    "_triggered"    -0.16
    "coni"          -0.15
    "omen"          -0.15
    "asca"          -0.15
    "itori"         -0.15
    "nodoc"         -0.14
    "uese"          -0.14
    " иму"          -0.14
    "lesen"         -0.14
    " ::="          -0.14
    Positive Logits
    " Atom"         0.24
    " Spike"        0.21
    " Harmony"      0.21
    "Atom"          0.21
    " đạo"          0.19
    " Ridley"       0.19
    "liğini"        0.18
    "dir"           0.18
    " Baz"          0.18
    "息"            0.17
    Activation Density: 0.070%

    No Known Activations