INDEX
    Explanations

    interrogative phrases or questions

    Negative Logits
    Token                 Logit
    " it"                 -0.82
    "It"                  -0.63
    (token not shown)     -0.57
    " It"                 -0.51
    "He"                  -0.51
    "adə"                 -0.50
    "it"                  -0.50
    " ذلك"                -0.50
    " bankası"            -0.47
    "Everybody"           -0.47

    Positive Logits
    Token                 Logit
    " we"                  1.00
    " they"                0.87
    " you"                 0.84
    " AssemblyCulture"     0.82
    "']],"                 0.78
    " wij"                 0.77
    " wir"                 0.76
    "')['"                 0.73
    "}';"                  0.71
    " THESE"               0.71

    Activation Density: 0.067%

    No Known Activations