    NEGATIVE LOGITS
    =-                  0.45
     haunting           0.42
     frightening        0.42
     conical            0.41
     terrifying         0.40
     frust              0.40
    [token missing]     0.40
    [token missing]     0.38
    =-\                 0.38
    [token missing]     0.38
    POSITIVE LOGITS
     praise             2.98
     praising           2.75
     praises            2.58
     Praise             2.52
     प्रशंसा              2.42
     praised            2.39
     compliment         2.33
     প্রশংসা              2.30
     compliments        2.27
     तारीफ               2.17
    Act Density 0.088%

    No Known Activations
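
The activation density figure above can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading only; it assumes density means "fraction of sampled tokens with activation above a threshold", which the dashboard itself does not state. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and not taken from the source.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "Act Density" is the share of sampled tokens on which the
    feature is active. This is an illustrative definition, not one confirmed
    by the dashboard.
    """
    if len(activations) == 0:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 100,000 token positions, 88 of which are active.
toy_activations = [1.5] * 88 + [0.0] * (100_000 - 88)
density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints "0.088%"
```

Under this reading, a density of 0.088% corresponds to roughly 88 active positions per 100,000 sampled tokens, so the feature is sparse.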