    Explanations

    expressions of gratitude or thanks

    expressions of gratitude

    Negative Logits
    Token             Logit
    " projecting"     -0.78
    " Osc"            -0.72
    "inese"           -0.67
    " indo"           -0.67
    " projected"      -0.64
    " helicop"        -0.64
    " deviation"      -0.62
    " unprotected"    -0.62
    " inhabited"      -0.62
    " inhab"          -0.61
    Positive Logits
    Token             Logit
    "gements"         1.03
    "gments"          0.96
    "giving"          0.92
    "ifully"          0.86
    " acknowled"      0.84
    "bly"             0.81
    "bles"            0.80
    " thank"          0.78
    "brance"          0.78
    "ingly"           0.77
    Activation Density: 0.014%

    No Known Activations