    Negative Logits
    (blank)       -0.08
     anne         -0.08
    几个          -0.08
     navigation   -0.07
    くら          -0.07
     factorial    -0.07
     sigma        -0.07
     gaire        -0.07
    Goto          -0.07
    дай           -0.07
    Positive Logits
     stance       0.09
     protests     0.08
     opinions     0.08
    'heure        0.08
     Patri        0.08
     ou           0.08
    (blank)       0.08
     Opinions     0.08
    .News         0.08
     Commentary   0.08
    Activation Density: 0.018%

    No Known Activations