INDEX
    Explanations

    phrases related to conflict and judgment

    Negative Logits
    Token       Logit
    ork         -0.15
    aca         -0.14
    wy          -0.14
    rog         -0.14
    azzi        -0.14
     пÑĢоп     -0.14
    αι          -0.14
    965         -0.14
    mom         -0.14
    ovit        -0.13
    Positive Logits
    Token       Logit
    idth        0.14
    uggle       0.14
    umbnail     0.14
    .px         0.14
     Owners     0.13
    ogui        0.13
    Eigen       0.13
    Implemented 0.13
    dbuf        0.13
    laughter    0.13
    Activation Density: 0.249%

    No Known Activations