INDEX
    Explanations

    expressions of gratitude or thanks

    Negative Logits
    Token            Logit
    " Hurricanes"    -0.84
    "ago"            -0.72
    " Samoa"         -0.69
    "Dest"           -0.67
    "女"             -0.66
    " Loft"          -0.64
    "abad"           -0.63
    " blaze"         -0.60
    " PW"            -0.60
    " Ultron"        -0.59
    Positive Logits
    Token            Logit
    " Thanks"        1.16
    "giving"         1.02
    "Thanks"         0.96
    "udos"           0.91
    "gment"          0.90
    "thanks"         0.87
    " thanks"        0.87
    "reenshots"      0.85
    "awaru"          0.83
    " Credits"       0.82
    Activation Density: 0.011%

    No Known Activations