    NEGATIVE LOGITS
     celebrating    -0.08
    ksel            -0.08
    /$',            -0.08
    annah           -0.07
    ஸ்              -0.07
     today's        -0.07
     celebrated     -0.07
    /'              -0.07
     подвер         -0.07
     gön            -0.07
    POSITIVE LOGITS
     deterr         0.19
     intimid        0.17
     intimidation   0.16
     intimidating   0.16
     deter          0.15
     scare          0.14
     intimidated    0.13
     repel          0.13
     assust         0.13
     डर             0.12
    Activation Density 0.047%

    No Known Activations