INDEX
    Explanations

    references to physical attacks or conflicts

    references to mugging incidents and associated discussions

    Negative Logits
    Token       Logit
    "cells"     -0.75
    "cel"       -0.75
    "ed"        -0.75
    "ters"      -0.74
    "coe"       -0.73
    "nd"        -0.72
    "ducers"    -0.72
    "ek"        -0.71
    "er"        -0.69
    "seed"      -0.69
    Positive Logits
    Token       Logit
    "ifully"    0.90
    "iless"     0.80
    "iful"      0.78
    " Hots"     0.77
    "enance"    0.74
    "istics"    0.70
    "lust"      0.67
    "llan"      0.65
    "ãĢij"       0.65
    "ainment"   0.65
    Activation Density: 0.158%

    No Known Activations