    Negative Logits
    1.47
    b
    1.24
    x
    1.14
    v
    1.09
    ある
    1.05
    d
    0.98
    د
    0.93
    c
    0.92
    0.91
    ۴
    0.91
    Positive Logits
    1.23
    </h3>
    0.95
     it
    0.95
     Islam
    0.94
    0.93
    0
    0.89
    <0x0D>
    0.86
     islam
    0.82
     a
    0.81
     isomerization
    0.80
    Activation Density: 0.002%

    No Known Activations