INDEX
    Explanations

    anomalous or unusual situations and questions

    Negative Logits
    Token        Logit
    "ja"         -0.17
    "LAG"        -0.16
    "álo"        -0.15
    "лада"       -0.15
    "ظÙĩ"        -0.15
    "æģIJ"       -0.14
    "ظ"          -0.14
    "lia"        -0.14
    "AGON"       -0.13
    "ami"        -0.13

    Positive Logits
    Token        Logit
    " nowhere"   0.18
    "_None"      0.17
    " none"      0.16
    " None"      0.16
    "wald"       0.15
    " neither"   0.14
    "chl"        0.14
    " nobody"    0.14
    "ousse"      0.14
    "isz"        0.14

    Activation Density: 0.093%

    No Known Activations