INDEX
    Explanations

    indicating prior state or action

    Negative Logits
    Token          Logit
    (blank)        0.43
    " Zet"         0.40
    (blank)        0.40
    "етра"         0.40
    "a"            0.39
    " tanıt"       0.39
    "inação"       0.38
    " spoj"        0.38
    " frapp"       0.37
    " ricorda"     0.37

    Positive Logits
    Token          Logit
    " pre"         1.33
    "Pre"          1.22
    " Pre"         1.18
    "pre"          1.17
    " प्री"          1.09
    " пре"         1.04
    (blank)        0.90
    " preorder"    0.80
    " preamp"      0.78
    " pré"         0.77
    Act Density 0.052%
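
A minimal sketch of how an activation density figure like the one above could be computed, assuming it means the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are assumptions for illustration and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: density is "share of tokens on which the feature fires",
    where "fires" means activation > threshold.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 10,000 token positions, 5 of which fire.
sample = [0.0] * 9995 + [0.7, 1.2, 0.3, 2.1, 0.9]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.050%
```

Under this reading, a density of 0.052% would correspond to the feature firing on roughly 5 of every 10,000 tokens.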

    No Known Activations