INDEX
    Explanations
    NEGATIVE LOGITS
    is            1.33
    이면          1.23
    [blank]       1.22
    ס             1.16
    이었다        1.12
    [whitespace]  1.12
    ي             1.08
    이었          1.06
    [blank]       1.05
    در            1.04
    POSITIVE LOGITS
    F         1.23
    ки        1.18
    L         1.16
    ien       1.15
    K         1.15
    v         1.09
    ches      1.05
    B         1.04
    letal     1.02
    ill       1.00
    Activation Density 0.006%

    No Known Activations