INDEX
    Explanations
    NEGATIVE LOGITS
     كار        -0.07
    بين         -0.07
                -0.06
     Boris      -0.06
     zi         -0.06
    -v          -0.06
     liệt       -0.06
    	T         -0.06
    _prev       -0.06
    ारण         -0.06
    POSITIVE LOGITS
    /values     0.07
     invert     0.07
     Clause     0.06
    =>{↵        0.06
    _hot        0.06
     enim       0.06
     Japanese   0.06
     italia     0.06
     พระ        0.06
    QUENCY      0.06
    Act Density 0.000%

    No Known Activations