    Explanations

    descriptive adjectives

    Negative Logits
     his            -0.08
    His             -0.07
    Density         -0.07
    ecause          -0.06
     our            -0.06
     this           -0.06
     Ра             -0.06
     Shooter        -0.06
     temptation     -0.06
     dare           -0.06
    Positive Logits
     Wrestle        0.07
    ѓ               0.07
    ?",             0.06
    agle            0.06
     Sự             0.06
     ReturnValue    0.06
    leyin           0.06
     *)[            0.06
     finans         0.06
    olves           0.06
    Activation Density: 0.128%

    No Known Activations