    NEGATIVE LOGITS
    "davis"       -0.44
    "UCN"         -0.44
    " eventual"   -0.41
    "TCL"         -0.40
    "ثيق"         -0.39
    " daly"       -0.39
    " Buch"       -0.38
    " Davis"      -0.38
    "ıntı"        -0.38
    " Dickson"    -0.37
    POSITIVE LOGITS
    " bored"      1.96
    " Bored"      1.86
    "Bored"       1.80
    "bored"       1.65
    " boredom"    1.48
    "Bore"        1.00
    " Bore"       0.84
    "bore"        0.82
    " abur"       0.80
    "无聊"        0.78
    Activation Density: 0.003%

    No Known Activations