INDEX
    Explanations

    titles of TV shows and movies

    Negative Logits
     dep          -0.15
    kos           -0.14
    Ä             -0.14
     guaranteed   -0.14
     Cos          -0.14
     gang         -0.13
    ederland      -0.13
     nerv         -0.13
     orth         -0.13
     garn         -0.13
    Positive Logits
    _tE           0.18
    _tA           0.17
    _mE           0.17
     nrw          0.15
    _mD           0.15
    _tF           0.15
    _tD           0.15
    ãģĭãģĹ        0.15
    _mB           0.15
    _tC           0.15
    Activation Density: 0.037%

    No Known Activations