    Negative Logits
    Token              Logit
    " жив"             -0.08
    "vente"            -0.08
    " можно"           -0.08
    "אי"               -0.08
    " hous"            -0.08
    "Fir"              -0.07
    " кні"             -0.07
    " fredag"          -0.07
    " можна"           -0.07
    " Confederate"     -0.07
    Positive Logits
    Token                        Logit
    "opts"                       0.09
    (token missing in source)    0.08
    " plaus"                     0.08
    "ugt"                        0.08
    (token missing in source)    0.08
    " stro"                      0.07
    "-opt"                       0.07
    "arras"                      0.07
    "_pending"                   0.07
    "warz"                       0.07
    Activation Density: 0.000%

    No Known Activations