INDEX
    Explanations

    phrases indicating established concepts or statuses

    Negative Logits
    "pson"         -0.18
    "еÑģÑı"        -0.15
    "ubl"          -0.15
    "ilers"        -0.15
    "åģı"          -0.15
    "hoa"          -0.15
    "onation"      -0.15
    ".fx"          -0.14
    "edis"         -0.14
    ".hardware"    -0.14
    Positive Logits
    "isko"         0.16
    " ÑĩеÑĢ"       0.15
    " Bett"        0.15
    " Gir"         0.14
    "ado"          0.14
    " Wir"         0.14
    "ummer"        0.14
    "osome"        0.14
    "ocos"         0.14
    "uron"         0.14
    Activation Density: 0.118%

    No Known Activations