    Explanations
    Negative Logits
    одо
    -0.16
    .cls
    -0.15
    .cf
    -0.14
    tering
    -0.14
     Wikipedia
    -0.14
    ",__
    -0.14
    تاÙĨ
    -0.14
     اÙĦÙħÙĤ
    -0.14
     wikipedia
    -0.13
     Nez
    -0.13
    Positive Logits
     bit
    0.42
    bit
    0.39
     tiny
    0.31
    .bit
    0.29
    tiny
    0.28
    Bit
    0.27
    _bit
    0.26
    	bit
    0.25
     Bit
    0.25
    goo
    0.25
    Act Density 0.013%

    No Known Activations