    NEGATIVE LOGITS
    "겠지만"    0.47
    " समझेंगे"    0.40
    " Ведь"    0.39
    [token not shown]    0.39
    "दी"    0.38
    "]::-"    0.38
    " kvůli"    0.37
    " خان"    0.37
    "知道"    0.36
    ")+("    0.36
    POSITIVE LOGITS
    " wow"    1.03
    " WOW"    1.03
    " Wow"    0.96
    "Wow"    0.91
    "WOW"    0.89
    "wow"    0.84
    " Frankly"    0.79
    " frankly"    0.76
    " OMG"    0.70
    " honestly"    0.68
    Activation Density: 0.003%

    No Known Activations