INDEX
    Explanations

    sentences that assert a statement or declaration

    Negative Logits
    " defic"        -0.69
    " Palest"       -0.68
    "lde"           -0.68
    " Advis"        -0.66
    "ARP"           -0.61
    "clinton"       -0.60
    "bent"          -0.60
    "Malley"        -0.59
    "duc"           -0.58
    "inker"         -0.58
    Positive Logits
    "Ĥİ"            0.75
    "oreal"         0.69
    "emort"         0.66
    " exchanged"    0.66
    " Simulator"    0.65
    " Nights"       0.64
    "etary"         0.64
    " Carnage"      0.63
    "rider"         0.63
    "enegger"       0.61
    Act Density 0.000%

    No Known Activations