INDEX
    Explanations

    comparisons between social concepts and actions

    Negative Logits
    "large"         -0.76
    "ilion"         -0.73
    "chwitz"        -0.71
    "ourced"        -0.71
    "ugar"          -0.69
    "Cover"         -0.68
    "fml"           -0.68
    " Pradesh"      -0.68
    "edIn"          -0.68
    "ourcing"       -0.68
    Positive Logits
    "osphere"       1.03
    " extraord"     1.02
    "liest"         0.95
    " archetype"    0.94
    " who"          0.93
    "hood"          0.92
    " whom"         0.84
    "iest"          0.80
    "who"           0.79
    " himself"      0.79
    Activation Density 0.306%

    No Known Activations