INDEX
    Explanations

    scientific texts

    Negative Logits
     evolved    -0.07
     borrowing  -0.07
     slim       -0.07
    lien        -0.06
    [H          -0.06
    okens       -0.06
     cosmetics  -0.06
    EM          -0.06
    .Gradient   -0.06
     тем        -0.06
    Positive Logits
    Continue    0.07
    ै↵          0.06
    Primary     0.06
    jets        0.06
    ples        0.06
    queen       0.06
     اليمن      0.06
     Freud      0.06
    neo         0.06
     noop       0.06
    Activation Density: 0.006%

    No Known Activations