    Explanations

    those followed by who/that/in

    Negative Logits
    "er"                    0.85
    "ות"                    0.75
    "es"                    0.74
    "ad"                    0.71
    "ার"                    0.67
    (token not rendered)    0.66
    " hypoglycemia"         0.64
    (token not rendered)    0.63
    " to"                   0.62
    " نے"                   0.60
    Positive Logits
    "ه"                     0.84
    "ని"                    0.69
    "ρες"                   0.64
    "a"                     0.64
    " pesky"                0.63
    (token not rendered)    0.63
    "ส์"                    0.61
    "פר"                    0.60
    "ರು"                    0.58
    "lene"                  0.57
    Activation Density: 0.030%

    No Known Activations