INDEX
    Explanations

    references to personal feelings of security and stability in various contexts

    Head Attr Weights
    Head  Weight
    0     0.02
    1     0.02
    2     0.14
    3     0.14
    4     0.17
    5     0.03
    6     0.05
    7     0.20
    8     0.03
    9     0.04
    10    0.06
    11    0.05
    Negative Logits
    Token            Logit
    " assumption"    -1.72
    "Redditor"       -1.69
    " Attribution"   -1.67
    " irrespective"  -1.65
    " implicitly"    -1.62
    "rather"         -1.61
    " presumption"   -1.56
    " indirectly"    -1.51
    "paralle"        -1.47
    " Cosponsors"    -1.43
    Positive Logits
    Token            Logit
    "utonium"        1.97
    " touring"       1.80
    "chester"        1.71
    "bledon"         1.67
    "licks"          1.67
    "ulkan"          1.66
    "angering"       1.60
    "usha"           1.59
    "ubs"            1.58
    "acly"           1.58
    Act Density 0.000%

    No Known Activations