INDEX
    Explanations
    Negative Logits
    Tim       -0.09
    казал     -0.08
    td        -0.07
    Roger     -0.07
    Alan      -0.07
    fter      -0.07
    دخل       -0.07
    cod       -0.07
    auled     -0.07
    ally      -0.07
    POSITIVE LOGITS
     Alta         0.08
     welt         0.07
    _bounds       0.07
     awaiting     0.07
    Attention     0.07
     bew          0.07
    _above        0.07
     unos         0.06
    **,           0.06
     Renewable    0.06
    Activation Density: 0.002%

    No Known Activations