    Explanations

    variations and distinctions in contexts

    Negative Logits
    " vectorielle"     -1.09
    " vectorielles"    -0.93
    " étoient"         -0.89
    " nucléaire"       -0.89
    " umana"           -0.89
    " mijne"           -0.86
    " umane"           -0.84
    " complètes"       -0.83
    " fumée"           -0.82
    " humaines"        -0.82

    Positive Logits
    " different"       1.55
    "Different"        1.48
    " Different"       1.41
    "different"        1.38
    " DIFFERENT"       1.31
    " difer"           1.16
    " diferente"       0.99
    " diff"            0.98
    " dif"             0.97
    "不同"             0.96
    Activation Density: 0.114%

    No Known Activations