INDEX
    Negative Logits
     süre       -0.06
    ω           -0.06
                -0.06
    ATING       -0.06
     این        -0.06
    ='../       -0.06
    ос          -0.06
    	Title      -0.06
     diffic     -0.06
     cinemas    -0.05
    Positive Logits
    .th         0.06
    dfunding    0.06
    だけ        0.06
     coerc      0.06
    xia         0.06
    "The        0.06
    rg          0.06
    astes       0.06
    reds        0.06
    accuracy    0.06
    Activation Density 0.902%

    No Known Activations