INDEX
    NEGATIVE LOGITS
    " yytype"       -0.06
    "ervals"        -0.06
    "anax"          -0.06
    " worksheet"    -0.06
    ",GL"           -0.06
    " payroll"      -0.05
    "‚Ì"            -0.05
    " PVC"          -0.05
    "부분"          -0.05
    "ไปย"           -0.05
    POSITIVE LOGITS
    "_SUR"          0.08
                    0.08
    "инг"           0.07
    " cog"          0.07
    " उस"           0.07
    " kim"          0.07
    "ATHER"         0.07
    " pornography"  0.07
    " spa"          0.07
    " eigen"        0.06
    Activation Density 0.002%

    No Known Activations