INDEX
    Explanations

    concepts related to the evaluation and classification of claims and contributions based on power dynamics and societal structures

    NEGATIVE LOGITS
    enan          -0.15
    allon         -0.14
    nowrap        -0.14
    legg          -0.14
    ¼             -0.14
    åįĵ           -0.14
    edef          -0.13
    _runner       -0.13
     discontin    -0.13
    zes           -0.13
    POSITIVE LOGITS
    ç¨ĭ度         0.30
     degree       0.29
    degree        0.26
     depending    0.25
     extent       0.24
     level        0.23
     Degree       0.22
     degrees      0.21
    extent        0.21
    Degree        0.21
    Activation Density: 0.256%

    No Known Activations