INDEX
    Explanations

    prominent nouns and phrases indicating value or judgment

    Negative Logits
    ectors        -0.17
    221           -0.16
     itself       -0.16
    veau          -0.15
     stuff        -0.15
    chw           -0.14
    lettes        -0.14
    ani           -0.14
    971           -0.14
    Stuff         -0.14
    Positive Logits
     few          0.17
    hire          0.16
     Atmospheric  0.16
     two          0.15
    isper         0.15
    ảy            0.14
    AKE           0.13
    hung          0.13
     thoughts     0.13
    quence        0.13
    Activation Density: 0.121%

    No Known Activations