INDEX
    NEGATIVE LOGITS
     of  0.63
    و  0.56
    0.54
    ال  0.52
    0.47
    ول  0.46
    0.46
     melanogaster  0.45
    0.45
    0.45
    POSITIVE LOGITS
    f  0.74
    y  0.71
    k  0.71
    b  0.68
    et  0.65
    at  0.64
    n  0.64
    e  0.63
    v  0.63
    Z  0.61
    Act Density 0.015%

    No Known Activations