INDEX
    Explanations

    Negation/Exclusion

    Negative Logits
    Token             Logit
    "一味"            -0.07
    " shutting"       -0.07
    " wherein"        -0.07
    "Forge"           -0.06
    "secondary"       -0.06
    " //<"            -0.06
    " overseeing"     -0.06
    "\tev"            -0.06
    " timestep"       -0.06
    " Legislature"    -0.06
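    Positive Logits
    Token                   Logit
    (token not captured)    0.07
    (token not captured)    0.07
    (token not captured)    0.07
    "Summer"                0.07
    "上了"                  0.06
    "idos"                  0.06
    (token not captured)    0.06
    " pontos"               0.06
    "'])↵"                  0.06
    " able"                 0.06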
    Activation Density: 0.084%

    No Known Activations