INDEX
    Explanations

    sentences emphasizing communal responsibility and social bonds

    Negative Logits
     منه
    -0.15
    ooo
    -0.14
    haven
    -0.13
    oton
    -0.13
    abh
    -0.13
    ait
    -0.13
    StateManager
    -0.13
    _Private
    -0.13
    obao
    -0.13
    داد
    -0.12
    Positive Logits
     whether
    0.23
    whether
    0.18
     Whether
    0.17
     yes
    0.17
    ""
    0.16
    yes
    0.16
    Whether
    0.16
     or
    0.15
     brick
    0.15
     even
    0.15
    Act Density 0.402%

    No Known Activations