INDEX
    Explanations
    NEGATIVE LOGITS
     Taylor     -0.07
    Info        -0.06
    3           -0.06
     makes      -0.06
     imag       -0.06
     iso        -0.06
    Hello       -0.06
    X           -0.06
     buildup    -0.06
     matrices   -0.06
    POSITIVE LOGITS
    ant         0.10
    enant       0.09
    ent         0.09
    ντ          0.09
    dent        0.09
    ANT         0.08
    ENT         0.08
    nant        0.08
    ants        0.08
    رض          0.08
    Activation Density: 0.101%

    No Known Activations