INDEX
    Explanations

    phrases related to online content and communication

    phrases related to sentiments of truth or honesty

    Negative Logits
    appropriately   -0.76
     Annex          -0.69
     itself         -0.62
    untarily        -0.61
     Brav           -0.59
     poorest        -0.59
     Brist          -0.58
     hosp           -0.56
     inciner        -0.56
    liest           -0.56
    Positive Logits
    20439           0.87
    plet            0.83
    tions           0.79
    isations        0.79
    utions          0.75
    endment         0.75
    isms            0.73
    bernatorial     0.72
    Reviewer        0.71
    izations        0.70
    Activation Density: 0.509%

    No Known Activations