    Explanations

    expressions of opinion or preference related to products or experiences

    Negative Logits
     elsewhere
    -0.17
     someone
    -0.15
    別
    -0.15
     occasionally
    -0.14
     periodically
    -0.14
     sometimes
    -0.14
    另
    -0.13
     another
    -0.13
    บาง
    -0.13
     Sometimes
    -0.13
    Positive Logits
    所有
    0.72
     all
    0.70
     every
    0.68
     wszyst
    0.64
    すべて
    0.63
     semua
    0.61
     모든
    0.60
     everything
    0.60
     всех
    0.57
     todas
    0.55
    Activation Density 1.118%

    No Known Activations