INDEX
    Explanations

    words related to illegality or illegal actions

    Negative Logits
    "็จ"            -0.81
    "𝘇"             -0.75
    "kast"          -0.73
    "Kast"          -0.71
    " Gump"         -0.71
    "CELLANEOUS"    -0.71
    " Waray"        -0.69
    " envies"       -0.68
    "ValueStyle"    -0.66
    "Rost"          -0.66
    Positive Logits
    " illeg"        1.08
    " Illegal"      1.00
    "illegal"       0.97
    " illegal"      0.96
    "Illegal"       0.88
    " ilegal"       0.88
    " illegally"    0.86
    " gills"        0.85
    " ileg"         0.85
    "LEGAL"         0.82
    Activation Density: 0.006%

    No Known Activations