INDEX
    Explanations

    references to different genres and categories of films

    Negative Logits
    Token           Logit
    "onn"           -0.17
    "imore"         -0.17
    "assin"         -0.16
    " personals"    -0.15
    "VR"            -0.15
    "emmel"         -0.15
    " Downs"        -0.14
    "âĶĶ"           -0.14
    "ometown"       -0.14
    "utters"        -0.14
    Positive Logits
    Token           Logit
    "Fil"           0.24
    " Fil"          0.23
    "_fil"          0.21
    " fil"          0.19
    "fil"           0.18
    " English"      0.18
    ".fil"          0.17
    " sound"        0.17
    " Films"        0.16
    "sound"         0.16
    Activation Density: 0.019%

    No Known Activations