    Explanations

    inferior, exterior, posterior

    Negative Logits
    Token             Logit
     monopolist       0.38
     思い             0.38
    umlu              0.38
    [token missing]   0.38
    shifting          0.37
    [token missing]   0.37
     emblematic       0.37
    awing             0.37
    angwa             0.37
    ట్లా               0.36
    Positive Logits
    Token     Logit
    ior       0.90
    iors      0.88
    IOR       0.80
    iores     0.75
    iour      0.72
    iore      0.69
    iori      0.64
    ieur      0.63
    iora      0.62
    mediate   0.61
    Activation Density: 0.010%

    No Known Activations