INDEX
    Explanations

    greetings and pleasantries

    Negative Logits
    " तकरीबन"  0.38
    " वैकल्पिक"  0.37
    " annunci"  0.36
    " metric"  0.36
    " 경우에는"  0.34
    " gön"  0.34
    " bagaimana"  0.34
    " تُ"  0.34
    "куса"  0.34
    " sincerity"  0.34

    Positive Logits
    "Excellent"  0.53
    " nice"  0.51
    "Hi"  0.50
    "Very"  0.48
    "Hello"  0.48
    "Nice"  0.47
    " Excellent"  0.46
    "不错的"  0.46
    "hello"  0.45
    " Nice"  0.45
    Activation Density: 0.010%

    No Known Activations