INDEX
    Negative Logits
    Token                   Logit
    (token not rendered)    -0.08
    "сов"                   -0.07
    "His"                   -0.07
    "state"                 -0.07
    (token not rendered)    -0.06
    "Threshold"             -0.06
    "供电"                  -0.06
    "ۈ"                     -0.06
    " hüküm"                -0.06
    (token not rendered)    -0.06
    Positive Logits
    Token                   Logit
    " dej"                  0.07
    "ており"                0.07
    " downloading"          0.07
    " indexed"              0.07
    (token not rendered)    0.07
    "应邀"                  0.07
    "agara"                 0.06
    " ARGS"                 0.06
    (token not rendered)    0.06
    "完整性"                0.06
    Activation Density: 0.001%

    No Known Activations