INDEX
    Explanations

    phrases related to power dynamics and moral responsibility

    references or mentions of the concept of freedom

    Negative Logits
     iPod           -0.65
     Henri          -0.61
     lay            -0.61
     Continental    -0.61
     Dele           -0.60
     email          -0.60
     Crom           -0.57
     Proud          -0.57
     mandate        -0.57
     Carrier        -0.57

    Positive Logits
    Ŀ                4.29
    Ł                1.81
    ľ                1.80
    ¡                1.76
    ļ                1.73
    ª                1.65
    ©                1.65
    ĺ                1.65
    Ĺ                1.61
    ŀ                1.61

    Activation Density: 0.267%

    No Known Activations