    NEGATIVE LOGITS
    Token     Value
    "an"      0.44
    "k"       0.42
    "ان"      0.37
    "ir"      0.37
    "ش"       0.35
    "ون"      0.35
    "b"       0.33
    "kita"    0.32
    "िक"      0.32
    "l"       0.31
    POSITIVE LOGITS
    Token            Value
    "N"              0.41
    "с"              0.38
    "з"              0.37
    "C"              0.34
    "<unused2008>"   0.34
    "V"              0.33
    "Y"              0.33
    " be"            0.32
    " políticos"     0.32
    "有助于"          0.32
    Activation Density: 0.900%

    No Known Activations