    Explanations

    instances of humor and irony in language

    Negative Logits
    ajo         -0.17
    ilk         -0.15
    girls       -0.15
    ÑģÑĤа       -0.15
    Ñģм         -0.14
    woff        -0.14
    venta       -0.14
    eya         -0.14
    ķĮ          -0.13
    ILA         -0.13
    Positive Logits
     animate    0.17
    .uniform    0.15
    umpt        0.14
    Ú¾          0.14
    otron       0.14
    IFORM       0.14
    åĿĢ         0.14
    _CBC        0.14
    主          0.14
     uniform    0.14
    Act Density 0.115%

    No Known Activations