INDEX
    Explanations

    references to collective entities or groups, particularly in political or economic contexts

    Head Attr Weights
    0:0.06
    1:0.02
    2:0.25
    3:0.09
    4:0.04
    5:0.06
    6:0.02
    7:0.01
    8:0.23
    9:0.12
    10:0.03
    11:0.02
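
The head attribution weights above can be ranked to see which heads dominate. The following is a minimal sketch, assuming each entry is a `head:weight` pair for one of 12 attention heads; the helper names `parse_weights` and `top_heads` are hypothetical and not part of any dashboard API.

```python
# Head attribution weights copied from the listing above (head index -> weight).
RAW = "0:0.06 1:0.02 2:0.25 3:0.09 4:0.04 5:0.06 6:0.02 7:0.01 8:0.23 9:0.12 10:0.03 11:0.02"


def parse_weights(text):
    """Parse whitespace-separated 'head:weight' pairs into a dict (hypothetical helper)."""
    weights = {}
    for pair in text.split():
        head, weight = pair.split(":")
        weights[int(head)] = float(weight)
    return weights


def top_heads(weights, k=3):
    """Return the k (head, weight) pairs with the largest weights (hypothetical helper)."""
    return sorted(weights.items(), key=lambda item: item[1], reverse=True)[:k]


weights = parse_weights(RAW)
top = top_heads(weights)
total = round(sum(weights.values()), 2)

print(top)    # [(2, 0.25), (8, 0.23), (9, 0.12)]
print(total)  # 0.95
```

Under this reading, heads 2, 8, and 9 carry 0.60 of the listed weight between them. The listed values sum to 0.95 rather than 1.00, which may simply reflect rounding to two decimals; the source does not say.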
    Negative Logits
    "FORE"    -1.62
    "yip"     -1.42
    "gemony"  -1.32
    "llor"    -1.20
    "thora"   -1.20
    "ukemia"  -1.19
    "ufact"   -1.17
    "asures"  -1.17
    "lement"  -1.15
    "terday"  -1.15
    Positive Logits
    " wedge"    1.40
    " upd"      1.34
    " wrench"   1.23
    " trem"     1.19
    " hither"   1.18
    "eros"      1.17
    " modem"    1.16
    " uphill"   1.13
    " ster"     1.11
    " Auditor"  1.10
    Act Density 0.016%

    No Known Activations