INDEX
    Explanations

    expressions of emotional experiences and reflections

    Negative Logits
    Token                 Logit
    " attract"            -0.18
    " attracted"          -0.15
    " ple"                -0.15
    " Shapes"             -0.15
    "uite"                -0.15
    "avou"                -0.15
    " удив"               -0.14
    " unp"                -0.14
    " आकर"                -0.14
    " attractiveness"     -0.14
    Positive Logits
    Token                 Logit
    " leave"               0.22
    " send"                0.21
    " left"                0.20
    " sends"               0.19
    " leaving"             0.19
    "leave"                0.19
    " leaves"              0.18
    "left"                 0.18
    " Send"                0.18
    "Send"                 0.18
    Act Density 0.124%
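
As a rough illustration of how an activation density figure like the one above might be read, the sketch below treats it as the percentage of tokens on which the feature fires. This is an assumption, not something the page confirms: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical, and the dashboard's actual computation may differ.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumed definition for illustration only; not taken from the dashboard.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 100,000 tokens, of which 124 have a nonzero activation.
sample = [0.0] * 99_876 + [1.5] * 124
print(f"{activation_density(sample):.3f}%")  # prints "0.124%"
```

Under this reading, a density of 0.124% would mean the feature is active on roughly one token in every 800.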

    No Known Activations