INDEX
    Explanations

    names of directors and films

    Negative Logits
    Token           Logit
    (whitespace)    -0.32
    ' S'            -0.28
    ' g'            -0.27
    ' c'            -0.27
    ' A'            -0.26
    ' W'            -0.26
    ' w'            -0.26
    ' "'            -0.26
    ' C'            -0.26
    ' ('            -0.26
    Positive Logits
    Token    Logit
    í        0.30
    á        0.29
    ý        0.28
    ín       0.27
    íž       0.25
    ě        0.25
    íd       0.25
    ír       0.25
    ů        0.24
    íř       0.24
    Act Density 0.026%

    No Known Activations