INDEX
    Explanations

    the word "were" and its variations

    Negative Logits
    Token        Logit
    "ax"         -0.69
    "шой"        -0.66
    "has"        -0.63
    "hadiran"    -0.60
    "pat"        -0.58
    " FAS"       -0.57
    " pat"       -0.57
    " aprobó"    -0.56
    "まと"       -0.55
    "FAS"        -0.55

    Positive Logits
    Token        Logit
    " were"      1.38
    "Were"       1.33
    " Were"      1.31
    "were"       1.23
    " WERE"      1.12
    "weren"      1.02
    " weren"     0.98
    " WER"       0.95
    "étaient"    0.93
    " wer"       0.89
    Activation Density: 0.219%

    No Known Activations