INDEX
    Explanations

    negation phrases and expressions

    NEGATIVE LOGITS
    Token (quoted)    Logit
    "MLLoader"        -0.59
    " Inscrivez"      -0.57
    " ainfi"          -0.56
    "cleros"          -0.54
    " iſt"            -0.54
    " Monfieur"       -0.54
    " tatuagem"       -0.53
    " rarity"         -0.52
    "outheast"        -0.52
    " saveiro"        -0.52
    POSITIVE LOGITS
    Token (quoted)    Logit
    " was"            0.85
    "was"             0.68
    "Was"             0.65
    " Was"            0.65
    " were"           0.63
    " originally"     0.60
    " buvo"           0.57
    "Twas"            0.56
    " था"             0.56
    " WAS"            0.55
    Activation Density: 0.047%

    No Known Activations