INDEX
    Explanations

    phrases related to legal actions and consequences

    punctuation marks and their frequencies in the text

    Negative Logits
    " conclud"          -0.68
    " describ"          -0.66
    " answ"             -0.60
    " grounding"        -0.60
    " baseline"         -0.60
    " aggregation"      -0.59
    "eatures"           -0.59
    " prescriptions"    -0.59
    " concess"          -0.59
    " isolation"        -0.58
    Positive Logits
    "nee"               0.87
    "rama"              0.82
    " etc"              0.77
    " Kinnikuman"       0.70
    "uthor"             0.69
    "icio"              0.68
    " supra"            0.68
    "oshi"              0.68
    "080"               0.67
    "wait"              0.67
    Activation Density: 0.200%

    No Known Activations