INDEX
    Explanations

    phrases related to consequences, actions, and decision-making

    expressions of caution or concern about actions and their consequences

    Negative Logits
    Token             Logit
    "htaking"         -0.69
    "interstitial"    -0.67
    "orious"          -0.61
    " Built"          -0.60
    " Skin"           -0.58
    "æľ"              -0.56
    " fame"           -0.56
    "itled"           -0.55
    "culosis"         -0.54
    " Birth"          -0.53
    Positive Logits
    Token             Logit
    " defe"           0.70
    " quo"            0.66
    " anyway"         0.64
    "etheless"        0.64
    " answ"           0.64
    " administr"      0.62
    " uncertainties"  0.61
    " downstream"     0.61
    " redund"         0.60
    " inaction"       0.59
    Activation Density: 2.781%

    No Known Activations