INDEX
    Explanations
    NEGATIVE LOGITS
    ".player"             -0.07
    ",每"                 -0.06
    "ΜΠ"                  -0.06
    (whitespace only)     -0.06
    "\tright"             -0.06
    "Ngoài"               -0.05
    (no visible token)    -0.05
    " Artem"              -0.05
    (no visible token)    -0.05
    "وران"                -0.05
    POSITIVE LOGITS
    "emy"                 0.07
    " Nature"             0.07
    "roduction"           0.07
    " uname"              0.07
    "acie"                0.06
    " есте"               0.06
    " Ce"                 0.06
    "(sw"                 0.06
    "punkt"               0.06
    " Refer"              0.06
    Activation Density: 0.139%

    No Known Activations