    Explanations

    simply, polite, fridge, ventilate, air, bird

    Negative Logits
    became          0.47
    *               0.46
    name            0.45
    tiny            0.45
    tsd             0.45
    ri              0.45
    are             0.44
    res             0.43
     получила       0.42
    Q               0.42
    Positive Logits
     extravagance   0.48
     رفت            0.46
     direcion       0.45
     mockery        0.45
     rollback       0.45
     nonchal        0.44
     fanatic        0.43
    𝐞               0.43
                    0.42
     faça           0.41
    Activation Density: 0.008%

    No Known Activations