INDEX
    Explanations

    questions and statements expressing confusion or seeking assistance

    Negative Logits
    Token         Logit
    "æĮ¯ãĤĬ"      -0.15
    "rimon"       -0.15
    "ulos"        -0.14
    "èŃ"          -0.14
    "ulan"        -0.14
    "orn"         -0.14
    " IMPLIED"    -0.14
    "ÑĤаж"        -0.14
    "hp"          -0.14
    "Bindable"    -0.14
    Positive Logits
    Token             Logit
    " wrong"          0.24
    " missing"        0.23
    " Wrong"          0.20
    " Missing"        0.19
    "Missing"         0.19
    " mistake"        0.18
    " overlooking"    0.18
    " WRONG"          0.18
    " mistakes"       0.17
    "Wrong"           0.17
    Activation Density: 0.034%

    No Known Activations