    Explanations

    concepts related to theoretical frameworks or models

    Head Attr Weights
    0:0.04
    1:0.02
    2:0.07
    3:0.09
    4:0.02
    5:0.04
    6:0.13
    7:0.21
    8:0.05
    9:0.05
    10:0.06
    11:0.16
    Negative Logits
    Token              Logit
    " spons"           -1.52
    " Ukrain"          -1.21
    "appropriately"    -1.18
    (not shown)        -1.18
    "pause"            -1.16
    "wagen"            -1.10
    " Roundup"         -1.09
    "mods"             -1.07
    "respect"          -1.06
    "vr"               -1.05
    Positive Logits
    Token              Logit
    " Obj"             1.18
    " meaning"         1.14
    "ographical"       1.11
    " plain"           1.04
    "ography"          1.02
    " phrases"         1.01
    "rencies"          1.00
    (not shown)        0.99
    "asketball"        0.99
    " constructs"      0.98
    Activation Density: 0.003%

    No Known Activations