INDEX
    Explanations
    Negative Logits
    " жест"        -0.07
    " اذ"          -0.06
    "ерв"          -0.06
    " relieved"    -0.06
    " Host"        -0.06
    " Bar"         -0.06
    " chapter"     -0.06
    "oyo"          -0.05
    " Harbor"      -0.05
    " plotting"    -0.05
    Positive Logits
    " лі"          0.08
    " freeing"     0.07
    " Не"          0.07
    "-ranging"     0.06
    "/trans"       0.06
    ".emplace"     0.06
    " sparing"     0.06
    "_micro"       0.06
    " jPanel"      0.06
    "rough"        0.06
    Activation Density: 0.008%

    No Known Activations