    Negative Logits
    Token               Logit
    " itching"          -0.08
    " malfunction"      -0.08
    " Summer"           -0.08
    " komis"            -0.08
    " Sunny"            -0.08
    "卫生"              -0.08
    " vandal"           -0.08
    " workshop"         -0.07
    (token missing)     -0.07
    "Spring"            -0.07
    Positive Logits
    Token               Logit
    " asym"             0.11
    (token missing)     0.11
    (token missing)     0.10
    "趋势"              0.10
    " logarith"         0.10
    " approximation"    0.09
    " approxim"         0.09
    " Absch"            0.08
    " expansions"       0.08
    (token missing)     0.08
    Activation Density: 0.022%

    No Known Activations