INDEX
    Explanations

    phrases related to future plans or predictions

    expressions related to influence and decision-making

    Negative Logits
     indo         -0.34
    )."           -0.32
     Afgh         -0.31
    "/>           -0.31
    ]."           -0.30
     disadvant    -0.30
     vulner       -0.30
     unemploy     -0.29
     destro       -0.29
     undermin     -0.29
    Positive Logits
    ivating       0.33
    utterstock    0.32
    ideshow       0.31
    agonist       0.31
    urable        0.30
    asting        0.30
    eaturing      0.30
    heimer        0.29
    iven          0.29
    ering         0.29
    Activation Density: 3.802%

    No Known Activations