INDEX
    Explanations

    words related to health and well-being

    terms related to beneficial or detrimental effects

    Negative Logits
    Token       Logit
    "buck"      -0.82
    "oute"      -0.72
    "pler"      -0.64
    "metal"     -0.63
    "ascus"     -0.62
    "hyde"      -0.62
    "rine"      -0.61
    "bish"      -0.61
    "herer"     -0.60
    "Tube"      -0.60
    Positive Logits
    Token             Logit
    " outweigh"       0.77
    "icial"           0.77
    " outcomes"       0.73
    " effects"        0.72
    " synerg"         0.72
    " influence"      0.71
    "nerg"            0.70
    " influences"     0.69
    " effect"         0.68
    " Advantage"      0.68
    Activation Density: 0.079%

    No Known Activations