INDEX
    Explanations

    mentions of user interface components

    Head Attr Weights
    0:0.07
    1:0.07
    2:0.08
    3:0.09
    4:0.08
    5:0.08
    6:0.09
    7:0.08
    8:0.08
    9:0.09
    10:0.06
    11:0.08
    Negative Logits
     unpop                 -2.87
     Samoa                 -2.75
     moratorium            -2.62
    Ã                      -2.58
    ––                     -2.58
     Angola                -2.53
     governors             -2.47
     shortages             -2.43
    ……………………               -2.43
                           -2.42
    Positive Logits
     Towards               2.62
    hyde                   2.59
    DonaldTrump            2.59
    oward                  2.58
    afort                  2.58
    hedon                  2.58
     Hate                  2.49
     Dangerous             2.49
     Dock                  2.46
     Digest                2.44
    Act Density 0.000%

    No Known Activations