INDEX
    Explanations
    Negative Logits
    Token           Logit
     It             -1.88
     acepción       -1.65
     it             -1.64
     its            -1.50
    </h6>           -1.50
    </h5>           -1.48
    ↵↵              -1.41
     típicos        -1.40
     motivadoras    -1.38
     Wahrnehmung    -1.38
    Positive Logits
    Token           Logit
     tabac          1.47
    几十            1.47
     manteau        1.43
     saba           1.42
     goût           1.39
     paillettes     1.37
     pavillon       1.37
                    1.35
     prins          1.35
     congé          1.34
    Act Density 0.020%

    No Known Activations