    NEGATIVE LOGITS
    Token             Logit
    "_case"           -0.08
    "older"           -0.07
    "_dim"            -0.07
    " בז"             -0.07
    " 대한"           -0.07
    "achd"            -0.07
    "path"            -0.07
    "mun"             -0.07
    "iect"            -0.07
    "provements"      -0.07
    POSITIVE LOGITS
    Token             Logit
    " 동안"           0.10
    " boyunca"        0.10
    "동안"            0.09
    " hinweg"         0.09
    "多久"            0.09
    " పాటు"           0.09
    ".Duration"       0.09
    ",然后"           0.08
    " مدت"            0.08
    " Duration"       0.08
    Activation Density: 0.103%

    No Known Activations