INDEX
    Explanations
    NEGATIVE LOGITS
    "THERS"        -0.87
    "他不"         -0.83
    " seorang"     -0.83
    "ների"         -0.79
    "GMP"          -0.78
    "erapeutic"    -0.78
    " なり"        -0.78
    "uya"          -0.77
    " someone"     -0.77
    "rators"       -0.77
    POSITIVE LOGITS
    " positions"     1.16
    " quarterback"   1.09
    " Positions"     1.06
    "Positions"      1.03
    " position"      1.03
    " center"        1.00
    "positions"      0.98
    " cornerback"    0.96
    " centre"        0.90
    " posisi"        0.89
    Activation Density: 0.030%

    No Known Activations