INDEX
    Explanations

    frequent and common nouns and verbs in text

    Negative Logits
    TRA        -0.17
    εÏħ        -0.17
    ÏĦοκ       -0.15
     Dil       -0.15
    rier       -0.14
    ials       -0.14
    UMB        -0.14
     Parti     -0.14
    elop       -0.14
    umber      -0.14

    Positive Logits
    yleft      0.17
    å¼ı        0.15
    úsqueda    0.15
    rray       0.14
    stretch    0.14
    Ñıн        0.14
    532        0.14
    .scala     0.14
    åħĥ        0.13
    shan       0.13

    Activation Density: 0.001%

    No Known Activations