INDEX
    Explanations

    numeric values and quantities

    Negative Logits
    Token       Logit
    "lama"      -0.16
    "oyer"      -0.16
    ".SC"       -0.15
    "AIM"       -0.15
    "kol"       -0.14
    "plug"      -0.14
    "ached"     -0.14
    " бу"       -0.14
    "ardon"     -0.14
    "elp"       -0.13
    Positive Logits
    Token       Logit
    "atas"      0.16
    "anth"      0.16
    "ané"       0.15
    "rens"      0.14
    "Verb"      0.14
    "aklı"      0.14
    ".jboss"    0.14
    "axis"      0.13
    "ish"       0.13
    " aus"      0.13
    Activation Density: 0.172%

    No Known Activations