INDEX
    Explanations

    general expressions of gratitude or acknowledgments in a text

    Negative Logits
    Token          Logit
    [U+00A0]       -0.19
    [ ]            -0.16
    [.↵↵]          -0.16
    (missing)      -0.15
    [ .↵↵]         -0.14
    [U+009D]       -0.14
    [ "]           -0.14
    (missing)      -0.14
    [ j]           -0.14
    [ U+00A0]      -0.13
    Positive Logits
    Token          Logit
    (missing)      0.26
    [''↵]          0.26
    [ा↵]           0.25
    [ें↵]           0.25
    [े↵]           0.23
    [ี้↵]           0.22
    [()↵]          0.20
    [ी↵]           0.20
    [ี↵]           0.20
    [」↵]          0.20
    Activation Density: 0.003%

    No Known Activations