    Explanations

    "can" followed by potential action

    Negative Logits
    Token         Value
    "ua"          1.04
    "ور"          0.96
    (blank)       0.95
    "Denne"       0.92
    "ujuan"       0.91
    " anciens"    0.87
    "oma"         0.85
    "ik"          0.84
    "ANG"         0.84
    "DM"          0.83
    Positive Logits
    Token         Value
    "ס"           1.16
    "ה"           1.09
    "ه"           0.98
    "an"          0.96
    (blank)       0.95
    "ע"           0.93
    "ان"          0.91
    "การ"         0.91
    " de"         0.87
    " on"         0.87
    Activation Density: 0.625%

    No Known Activations