    Negative Logits
     fashions     -0.08
    ראות          -0.08
    ease          -0.08
    specifier     -0.07
    (ST           -0.07
     stadium      -0.07
    ailer         -0.07
    ée            -0.07
     Stadium      -0.07
    oid           -0.07
    Positive Logits
     اللون         0.14
    -white         0.13
    -yellow        0.12
                   0.12
    -green         0.12
     orange        0.12
    黄色           0.11
    _WHITE         0.11
    -blue          0.11
     blue          0.11
    Activation Density: 0.094%

    No Known Activations