INDEX
    Explanations

    sentences related to legal or criminal activities

    Negative Logits
    Token       Logit
    "adish"     -0.66
    "dar"       -0.65
    "ifter"     -0.61
    "oufl"      -0.61
    " Jou"      -0.59
    "asp"       -0.58
    "inas"      -0.58
    "fare"      -0.57
    "stall"     -0.55
    "owler"     -0.55
    Positive Logits
    Token          Logit
    " thereto"     1.40
    " to"          1.03
    "entious"      0.89
    "To"           0.84
    " unto"        0.84
    "itionally"    0.81
    "itiz"         0.79
    "ences"        0.77
    "sov"          0.73
    "to"           0.72
    Activation Density: 2.781%

    No Known Activations