INDEX
    Explanations

    really and genuinely positive

    Negative Logits
     unimaginable    0.63
     extravagant     0.56
     infamous        0.55
     shocking        0.54
     horrifying      0.53
     bizarre         0.51
     drastic         0.51
     extreme         0.50
     unprecedented   0.50
     якобы           0.48
    Positive Logits
     really          0.89
     naprawdę        0.77
     действительно   0.75
     vraiment        0.75
     wirklich        0.74
    really           0.73
     realmente       0.71
     Really          0.71
     gerçekten       0.71
     genuinely       0.71
    Activation Density: 0.004%

    No Known Activations