INDEX
    Explanations

    references to rules or regulations regarding behavior

    after capitalized words

    Negative Logits
    Token             Logit
    " recognized"     -0.39
    "ますね"          -0.38
    " recognised"     -0.38
    " Loyalty"        -0.37
    "のかな"          -0.37
    "ripto"           -0.37
    " Few"            -0.35
    "recognized"      -0.35
    " Message"        -0.35
    " Recognizing"    -0.34
    Positive Logits
    Token                 Logit
    "GEBURTSDATUM"        0.73
    " betweenstory"       0.71
    "########."           0.65
    " AssemblyProduct"    0.59
    "fuck"                0.57
    "fucking"             0.56
    " oprot"              0.56
    " fuck"               0.52
    "UrlResolution"       0.52
    "FUCK"                0.51
    Activation Density: 0.506%

    No Known Activations