INDEX
    Explanations

    sentences expressing personal opinions or critiques about films

    NEGATIVE LOGITS
    " Compatible"    -0.16
    "appe"           -0.15
    "UNK"            -0.15
    "åŀ"             -0.15
    "unk"            -0.15
    "andas"          -0.14
    "woord"          -0.14
    "ansi"           -0.14
    "öh"             -0.14
    "ittest"         -0.14
    POSITIVE LOGITS
    "acher"          0.17
    ".cx"            0.17
    "-await"         0.15
    "zem"            0.14
    "chw"            0.14
    " воÑĢ"          0.14
    " synchron"      0.14
    "uling"          0.14
    "achat"          0.14
    "LAT"            0.14
    Activation Density: 0.094%

    No Known Activations