INDEX
    Explanations

    health-related content, particularly focusing on medical studies and events related to physical harm

    Negative Logits
    Token         Logit
    " RELE"       -0.75
    " Uni"        -0.68
    " infographic" -0.68
    " wholesale"  -0.67
    "ãĥĩãĤ£"      -0.66
    " liv"        -0.66
    " Creat"      -0.65
    " apr"        -0.62
    "ãĤ´ãĥ³"      -0.61
    " Unle"       -0.60
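Two of the negative-logit tokens listed above, `ãĥĩãĤ£` and `ãĤ´ãĥ³`, look like mojibake but are probably raw vocabulary strings. The sketch below assumes the model uses a GPT-2 style byte-level BPE tokenizer, which this page does not confirm. Under that assumption each character stands for one byte, and mapping the characters back to bytes recovers the UTF-8 text. The helper names `bytes_to_unicode` and `decode_token` are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Byte -> printable-character table used by GPT-2 style byte-level BPE.

    Assumption: the feature's model uses this scheme for its vocabulary.
    """
    # Bytes that are already printable map to themselves.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: vocabulary character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a raw vocabulary string back into readable text."""
    raw = bytearray()
    for ch in token:
        if ch in UNICODE_TO_BYTE:
            raw.append(UNICODE_TO_BYTE[ch])
        else:
            # Characters outside the table (for example a literal space
            # that the page has already decoded) pass through unchanged.
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("ãĥĩãĤ£"))  # ディ
    print(decode_token("ãĤ´ãĥ³"))  # ゴン
```

If the assumption holds, both entries are katakana fragments rather than encoding damage, so they are left in their raw form in the lists.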
    Positive Logits
    Token         Logit
    "anyahu"      0.81
    "rette"       0.79
    "orf"         0.79
    "rient"       0.77
    "ork"         0.75
    "ady"         0.75
    "intel"       0.74
    "jab"         0.74
    "vals"        0.73
    "ipal"        0.73
    Activation Density 0.282%

    No Known Activations