INDEX
    Explanations

    forms of do and be in questions

    Negative Logits
    ".*"       0.38
    "رو"       0.37
    "是他"     0.34
    ".(*"      0.34
    "iség"     0.33
    "Count"    0.33
    " ندارد"   0.33
    "مة"       0.33
    "Inner"    0.33
    "*."       0.32
    Positive Logits
    " it"      0.96
    " they"    0.83
    " we"      0.78
    " you"     0.76
    " he"      0.64
    " this"    0.63
    "?),"      0.62
    " the"     0.59
    "?)."      0.59
    "?)"       0.59
    Activation Density: 0.086%

    No Known Activations