    Explanations

    phrases related to social and political issues

    Negative Logits
    Token         Logit
     ILCS         -0.70
    TEXTURE       -0.66
     Flavoring    -0.63
    ModLoader     -0.63
    Ire           -0.62
     Lerner       -0.60
    ULL           -0.56
    Natural       -0.56
    Snap          -0.55
    LET           -0.54
    Positive Logits
    Token         Logit
     ..."         0.97
    "}            0.96
    !".           0.95
    "?            0.91
    ").           0.91
    ?".           0.91
    "/>           0.89
     equals       0.89
    .")           0.88
     sucks        0.86
    Activation Density: 0.187%

    No Known Activations