INDEX
    Explanations

    references to human emotions and societal issues

    Negative Logits
    unny         -0.16
    .nih         -0.14
    gv           -0.14
    vrd          -0.14
    á»ı          -0.14
     COPYING     -0.14
    Ưá»          -0.14
    .ib          -0.13
    bourne       -0.13
    _override    -0.13
    Positive Logits
    usat         0.17
    Ø´ÙĪ        0.14
    isto         0.14
    .Win         0.14
     Haj         0.14
     ang         0.14
     ownership   0.14
     Jar         0.14
    ataka        0.14
    ativ         0.13
    Act Density 0.389%

    No Known Activations