INDEX
    Explanations

    expressions of gratitude and appreciation

    Negative Logits
    _refl
    -0.15
    raw
    -0.13
    uko
    -0.13
    tent
    -0.13
    .gov
    -0.13
    ativ
    -0.13
    akan
    -0.13
    履
    -0.13
    ark
    -0.13
    log
    -0.13
    Positive Logits
     privileged
    0.25
     privilege
    0.22
     opportunity
    0.21
    privileged
    0.21
     able
    0.19
    _timing
    0.18
     fortunate
    0.18
     enough
    0.17
     privileges
    0.17
     priv
    0.17
    Act Density 0.037%

    No Known Activations