    Explanations

    names and references to various individuals, such as celebrities and sports figures

    repeated mentions of the substring "ra"

    Negative Logits
    "iaries"      -0.78
    "ij士"         -0.74
    "regor"       -0.72
    "lace"        -0.71
    " curfew"     -0.69
    " GOODMAN"    -0.69
    "é¾"          -0.67
    " charism"    -0.66
    " MacArthur"  -0.65
    " OW"         -0.63
    Positive Logits
    "irie"    1.21
    "ven"     1.19
    "fter"    1.18
    "fters"   1.16
    "eus"     1.11
    "ving"    1.07
    "xon"     1.05
    "ppy"     1.03
    "plets"   1.03
    "ffe"     1.00
    Activation Density: 0.025%

    No Known Activations