    Negative Logits
                     -0.07
     hiçbir          -0.06
    .MM              -0.06
    ANE              -0.06
    ibile            -0.05
    OLTIP            -0.05
    Lists            -0.05
     wurde           -0.05
    ек               -0.05
    -cols            -0.05
    Positive Logits
    	results         0.07
    .Key             0.07
     decency         0.06
     obtain          0.06
     BT              0.06
    {o               0.06
                     0.06
     concentrations  0.06
    Opts             0.06
     variations      0.06
    Activation Density: 0.005%

    No Known Activations