    Negative Logits
    " Were"           0.46
    " Career"         0.46
    "</i>"            0.45
    " Appetite"       0.42
    " Chips"          0.41
                      0.40
    " were"           0.40
    "Were"            0.40
    " Appliance"      0.40
    "</b>"            0.40
    Positive Logits
    "ತು"              0.45
    "accouchement"    0.45
    "Ş"               0.43
    "v"               0.42
    "тной"            0.41
    "taine"           0.41
    "ಸ್ತ"             0.41
    "baş"             0.40
    "allaitement"     0.40
    "بی"              0.40
    Activation Density: 0.001%

    No Known Activations