INDEX
    Explanations

    mentions of physical activity

    instances of the word "exercise."

    Negative Logits
        "fixed"     -0.86
        "oho"       -0.76
        "gets"      -0.73
        "lines"     -0.73
        "ymes"      -0.69
        "lining"    -0.67
        "ener"      -0.66
        "lined"     -0.65
        "ocide"     -0.64
        "alez"      -0.63
    Positive Logits
        "ercise"       0.93
        " exerc"       0.88
        " exercise"    0.79
        "issance"      0.77
        " routines"    0.76
        " Exercise"    0.76
        " Pwr"         0.75
        "icular"       0.74
        "oleon"        0.73
        "thur"         0.72
    Activation Density: 0.015%

    No Known Activations