    Explanations

    possessive pronouns and articles after commas

    Negative Logits
    " carried"      0.56
    "MAN"           0.55
    "From"          0.54
    "Works"         0.51
    " இருந்தது"      0.51
    "This"          0.50
    "Detailed"      0.49
    "worked"        0.48
    " triggered"    0.48
    "сима"          0.48
    Positive Logits
    " your"         0.83
    " our"          0.80
    " the"          0.69
    " vaš"          0.67
    " vaše"         0.67
    " আপনার"        0.66
    "私たちの"       0.66
    " their"        0.65
    " നിങ്ങളുടെ"     0.65
    " ஒரு"          0.64
    Activation Density: 0.001%

    No Known Activations