    Negative Logits
        " contamos"      -0.85
        " tratando"      -0.83
        "Agreed"         -0.81
        " impulsar"      -0.81
        "ボーナス"        -0.79
        " superfície"    -0.79
        "étant"          -0.77
        " cowards"       -0.76
        " kalimat"       -0.75
        "从事"            -0.75
    Positive Logits
        "bite"           1.34
        "proofing"       1.24
        "proof"          1.24
        " bites"         1.22
        "tracks"         1.10
        " waves"         1.09
        " bytes"         1.06
        "stage"          1.05
        "bites"          1.05
        "cloud"          1.03
    Activation Density: 0.025%

    No Known Activations