    Explanations

    specific actions or processes related to growth, introduction, or assessment

    Negative Logits
    Token             Logit
    " a"              -0.28
    (not shown)       -0.27
    " consequently"   -0.26
    "比べて"          -0.25
    " الع"            -0.25
    " even"           -0.25
    "t"               -0.24
    " q"              -0.24
    " arrep"          -0.24
    " necessarily"    -0.23
    Positive Logits
    Token             Logit
    "uxxxx"           1.05
    " препратки"      0.98
    " للمعارف"        0.95
    " queſta"         0.95
    "ロウィン"        0.95
    " ſind"           0.88
    " témoig"         0.88
    " ostavi"         0.87
    " typelib"        0.87
    " zwiſchen"       0.86
    Activation Density: 0.109%

    No Known Activations