INDEX
    Explanations

    acceptable and suitable

    Negative Logits
    Token               Value
    "im"                0.91
    " nem"              0.77
    (no token shown)    0.77
    " lt"               0.76
    " inteligente"      0.76
    (no token shown)    0.75
    " ali"              0.74
    "ا"                 0.74
    "microsoft"         0.73
    "-]+"               0.73
    Positive Logits
    Token               Value
    "с"                 1.02
    " thisStudent"      1.01
    "νης"               0.93
    "роме"              0.93
    (no token shown)    0.91
    "для"               0.88
    " нада"             0.87
    "<unused149>"       0.87
    "redditmedia"       0.86
    "ف"                 0.85
    Activation Density: 0.330%

    No Known Activations