    Negative Logits
    " oldValue"       -0.08
    " isset"          -0.07
    " viol"           -0.07
    "(e"              -0.07
    "Rules"           -0.07
    " gördüğü"        -0.07
    " wollte"         -0.07
    (not shown)       -0.07
    "_PULL"           -0.07
    " территории"     -0.06
    Positive Logits
    (not shown)       0.07
    " politically"    0.07
    (not shown)       0.07
    (not shown)       0.07
    (not shown)       0.07
    "etak"            0.06
    (not shown)       0.06
    (not shown)       0.06
    "listed"          0.06
    (not shown)       0.06
    Activation Density 0.002%

    No Known Activations