    Negative Logits
    Token                 Logit
    "ک"                   1.31
    "ק"                   1.28
    "اں"                  1.25
    (no visible token)    1.13
    "ง่าย"                 1.12
    "ма"                  1.11
    "อำ"                  1.07
    "ată"                 1.05
    " can"                1.04
    "지를"                1.04
    Positive Logits
    Token                 Logit
    (no visible token)    1.70
    " of"                 1.57
    "Cup"                 1.48
    "9"                   1.43
    " Cups"               1.39
    "8"                   1.34
    "6"                   1.32
    (no visible token)    1.32
    "1"                   1.30
    " Cup"                1.27
    Activation Density: 0.012%

    No Known Activations