INDEX
    Explanations

    instances of strong affirmations or enthusiastic expressions

    Negative Logits
    Życiorys
    -0.59
    MLLoader
    -0.50
    ьаж
    -0.48
     utafitiHapana
    -0.41
     Numerade
    -0.39
    öll
    -0.38
     serez
    -0.38
     downvotes
    -0.37
    いわゆる
    -0.37
    
    -0.37
    Positive Logits
     these
    0.75
    These
    0.73
    Bonus
    0.68
     concludes
    0.67
    these
    0.65
     These
    0.64
    Conclusion
    0.62
     theſe
    0.60
    BONUS
    0.60
    Honorable
    0.59
    Activation Density: 0.004%

    No Known Activations