    Explanations

    phrases related to ridicule and mockery

    Negative Logits
    Token           Logit
    " ka"           -0.17
    "ka"            -0.17
    "Ka"            -0.16
    "ót"            -0.15
    " å¯"           -0.15
    "нÑıÑĤ"         -0.15
    "lor"           -0.14
    " Ka"           -0.14
    "ksi"           -0.14
    "iplinary"      -0.14
    Positive Logits
    Token           Logit
    " everything"   0.45
    " every"        0.43
    " EVERY"        0.39
    " Everything"   0.39
    "_every"        0.38
    " Every"        0.38
    "Everything"    0.37
    "everything"    0.37
    " everyone"     0.37
    "Every"         0.35
    Activation Density: 0.057%

    No Known Activations