    Explanations

    phrases related to arguments, debates, and comparisons

    Negative Logits
    Token             Logit
    "ciating"         -0.80
    "opoly"           -0.68
    " enjoyment"      -0.62
    " heals"          -0.62
    "Reply"           -0.61
    "ilty"            -0.60
    "iban"            -0.58
    "wake"            -0.56
    "inters"          -0.55
    " disapprove"     -0.55
    Positive Logits
    Token             Logit
    " resorted"       0.89
    " recourse"       0.80
    " devised"        0.79
    " resort"         0.75
    " teamed"         0.74
    " opted"          0.73
    "pmwiki"          0.72
    " collaborated"   0.70
    " enlisted"       0.69
    "Firstly"         0.69
    Act Density 0.234%

    No Known Activations