INDEX
    Explanations

    words related to exerting force or pressure on someone or something

    references to coercion or force applied to individuals

    Negative Logits
    Token        Logit
    "uded"       -0.65
    "izoph"      -0.58
    "ILA"        -0.57
    "aming"      -0.57
    "von"        -0.56
    "ording"     -0.55
    " artif"     -0.55
    "effect"     -0.55
    " Shap"      -0.54
    "angered"    -0.54
    Positive Logits
    Token        Logit
    " into"      1.10
    " to"        1.03
    "into"       0.98
    " toward"    0.96
    " onward"    0.95
    " onto"      0.95
    " towards"   0.92
    " onwards"   0.87
    " thereto"   0.86
    "To"         0.79
    Activation Density: 0.164%

    No Known Activations