INDEX
    Explanations
    Negative Logits
    " going"         -0.07
    " і"             -0.07
    "ัม"             -0.07
    "<Account"       -0.07
    ".psi"           -0.06
                     -0.06
    "ICODE"          -0.06
    " preferably"    -0.06
    " отк"           -0.06
    " TLabel"        -0.06
    Positive Logits
    " make"          0.11
    " made"          0.10
    "Make"           0.08
    " making"        0.08
    " makes"         0.07
    " MAKE"          0.07
    " Make"          0.07
    "!"              0.07
    " Makes"         0.06
    "_make"          0.06
    Activation Density: 0.037%

    No Known Activations