    Explanations
    Negative Logits
    " unchanged"         -0.82
    "Attendees"          -0.81
    " nadal"             -0.81
    "//});"              -0.80
    " again"             -0.80
    "neté"               -0.79
    "нін"                -0.77
    "Біографія"          -0.76
    "//};"               -0.74
    "-------------</"    -0.73
    Positive Logits
    " jumped"            2.17
    " join"              2.14
    " jumping"           2.02
    " joining"           1.98
    " joins"             1.94
    " jump"              1.92
    " jumps"             1.80
    " Jumping"           1.76
    " bandwagon"         1.73
    "jumping"            1.73
    Activation Density: 0.063%

    No Known Activations