INDEX
    Explanations

    expressions of emotional intensity or strong feelings

    Negative Logits
     "                  -0.23
    s                   -0.21
    ï¼ļ"                -0.19
     '                  -0.18
    'nin                -0.18
    Ùĩ                  -0.17
    -"                  -0.16
     ("                 -0.16
    ooth                -0.15
    ..."                -0.15

    Positive Logits
    ing                 0.24
    Ø©                  0.19
    (                   0.18
    er                  0.17
     “[                 0.17
    ed                  0.17
    页éĿ¢åŃĺæ¡£å¤ĩ份      0.16
                        0.15
    ,”                  0.15
    ehr                 0.15
    Act Density 0.042%

    No Known Activations