INDEX
    Explanations

    pronouns/prepositions

    Negative Logits
    .ListBox      -0.07
    ْ             -0.07
    (argv         -0.07
    .Inject       -0.07
    _cv           -0.06
                  -0.06
     матери       -0.06
    Pinterest     -0.06
    ‌دهد           -0.06
    .lazy         -0.06
    Positive Logits
    ANCES         0.06
    ávací         0.06
     deer         0.06
    _CHK          0.06
    orus          0.06
     questi       0.06
     Antonio      0.06
     unserialize  0.06
     kron         0.06
     чому         0.06
    Activation Density: 0.309%

    No Known Activations