INDEX
    Explanations
    Negative Logits
     increa       -0.85
     fortn        -0.82
     affor        -0.80
     shenan       -0.77
     strick       -0.76
     michelin     -0.75
     volunte      -0.73
     attemp       -0.73
     tucson       -0.72
     jurassic     -0.71
    Positive Logits
    <bos>         0.66
    Ten           0.64
    tenth         0.64
     Ten          0.57
     diez         0.53
     ten          0.53
    OneToMany     0.53
     simplifié    0.52
    October       0.52
    ponses        0.50
    Activation Density: 0.175%

    No Known Activations