INDEX
    Explanations

    changes in perception and behavior regarding societal issues and individual actions

    Negative Logits
     itself
    -0.22
     उसन
    -0.20
     خودش
    -0.20
     yourself
    -0.17
     оно
    -0.15
    ，它
    -0.15
     آن
    -0.15
     그는
    -0.15
     nó
    -0.14
     kendisi
    -0.14
    Positive Logits
     their
    1.44
    their
    1.27
     Their
    1.19
    Their
    1.18
     THEIR
    1.06
     их
    1.03
     jejich
    0.96
     leur
    0.95
     leurs
    0.94
     loro
    0.94
    Activation Density 2.760%

    No Known Activations