INDEX
    Explanations

    phrases related to personal intentions or feelings

    phrases related to feelings, issues, and societal or communal circumstances

    Negative Logits
    エル         -0.71
    etheless     -0.65
     looph       -0.62
    ゴン         -0.62
    ハ           -0.60
    .):          -0.60
    ル           -0.59
    ウス         -0.58
    ˈ            -0.57
    ワン         -0.55
    Positive Logits
     ..."        1.53
     …"          1.44
    ,"           1.43
    ..."         1.38
    …"           1.32
    ?"           1.24
    )."          1.21
     ...         1.12
     we          1.12
    ),"          1.11
    Activation Density 0.394%

    No Known Activations