    Explanations

    references to films and their classifications

    Negative Logits
    " personals"    -0.17
    "ipherals"      -0.16
    "/topics"       -0.15
    "ome"           -0.14
    "assin"         -0.14
    "ska"           -0.14
    "emmel"         -0.14
    " возраст"      -0.14
    "onn"           -0.14
    "OME"           -0.14
    Positive Logits
    " Fil"          0.23
    "Fil"           0.22
    "_fil"          0.21
    "fil"           0.20
    " fil"          0.18
    " English"      0.18
    ".fil"          0.17
    " films"        0.17
    "English"       0.17
    "ivos"          0.17
    Act Density 0.022%

    No Known Activations