INDEX
    Explanations

    common phrases or interjections, particularly those that involve pauses or brief grammatical transitions

    Negative Logits
    Token           Logit
    ä¹İ             -0.17
    mina            -0.16
    irling          -0.15
    è¸              -0.14
    اة              -0.14
    orna            -0.14
    teenth          -0.14
    atan            -0.14
    dit             -0.14
    kiye            -0.14
    Positive Logits
    Token           Logit
     Joi            0.14
    oyo             0.14
    Âĺ              0.14
    atel            0.14
    abus            0.14
    à¸ŀร            0.14
    Pros            0.14
    ichel           0.13
     addCriterion   0.13
    ongs            0.13
    Act Density 0.083%
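
For readers unfamiliar with the activation density figure, the sketch below shows one common way such a number is computed: the fraction of token positions on which the feature's activation is above zero. This is an assumption about the metric, not a description of how this page derived its value. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature fires;
    the dashboard's exact definition is not stated in the source.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 12,000 token positions, feature fires on 10 of them.
sample = [0.0] * 11990 + [0.5] * 10
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low means the feature is silent on almost all tokens, which is in line with the page reporting no known activations in its sample.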

    No Known Activations