INDEX
    Explanations

    punctuation marks and special characters

    Negative Logits
    Token            Logit
    " gentes"        -0.60
    " Partagez"      -0.57
    "わかった"        -0.57
    " forgiven"      -0.57
    " blooming"      -0.56
    " charging"      -0.56
    " resumption"    -0.56
    " tandis"        -0.55
    "fresh"          -0.55
    "folding"        -0.55

    Positive Logits
    Token            Logit
    " will"          1.17
    " has"           1.16
    " was"           1.10
    " is"            1.07
    " could"         1.05
    " had"           1.04
    " would"         1.03
    " have"          1.03
    " also"          0.98
    " must"          0.98
    Activation Density: 0.313%

    No Known Activations