INDEX
    Explanations

    accents and other languages

    Negative Logits
    " is"       0.73
    " it"       0.70
    " on"       0.67
    " It"       0.64
    " obat"     0.63
    " PTSD"     0.62
    "ুরী"       0.61
    " dosen"    0.61
    " gourd"    0.57
    " an"       0.56
    Positive Logits
    "ون"        0.82
    "ه"         0.76
    "áno"       0.75
    "á"         0.75
    "ز"         0.72
    "ні"        0.71
    "تش"        0.71
    "ет"        0.71
    "é"         0.70
    "تين"       0.69
    Activation Density: 0.000%

    No Known Activations