    Negative Logits
    Token           Logit
    (not shown)     -0.08
    " Muhammed"     -0.07
    "uckle"         -0.07
    " поб"          -0.06
    (not shown)     -0.06
    " demol"        -0.06
    " Cow"          -0.06
    " atmos"        -0.06
    " disgrace"     -0.06
    " Sul"          -0.06
    Positive Logits
    Token           Logit
    " imprint"      0.14
    "prints"        0.08
    "-ft"           0.07
    " instinct"     0.06
    "indx"          0.06
    " faded"        0.06
    " fades"        0.06
    "ingredient"    0.06
    " daytime"      0.06
    ">m"            0.06
    Activation Density 0.001%

    No Known Activations