INDEX
    NEGATIVE LOGITS
    Token             Logit
    " CAT"            -0.07
    " Crest"          -0.07
    (not rendered)    -0.07
    " Work"           -0.07
    " 사랑"           -0.07
    "edn"             -0.07
    "ufact"           -0.06
    " Tech"           -0.06
    " PMC"            -0.06
    " Lean"           -0.06
    POSITIVE LOGITS
    Token             Logit
    " newspaper"      0.14
    " newspapers"     0.13
    " Newsp"          0.12
    " Newspaper"      0.10
    (not rendered)    0.08
    "aper"            0.07
    " disappearing"   0.06
    "(Local"          0.06
    (not rendered)    0.06
    " disappear"      0.06
    Activation Density: 0.003%

    No Known Activations