INDEX
    Explanations

    sentences that indicate statements or declarations

    Negative Logits
    Token             Logit
     utter            -0.82
     installments     -0.69
     royalty          -0.67
     transact         -0.66
     monopol          -0.65
     delusion         -0.65
     expense          -0.65
     closet           -0.64
     silly            -0.64
     victories        -0.64
    Positive Logits
    Token             Logit
     Additionally     0.94
     However          0.90
     Similarly        0.88
    <|endoftext|>     0.88
     Furthermore      0.88
     Moreover         0.87
     Along            0.85
     Adding           0.85
     Instead          0.85
     Afterwards       0.85
    Activation Density: 0.289%

    No Known Activations