INDEX
    Explanations

    phrases implying the truth or validity of a statement

    phrases that indicate contrasts or negations of statements

    Negative Logits
    Token             Logit
    "ngth"            -0.73
    " freezes"        -0.70
    "umbnails"        -0.69
    "alin"            -0.63
    " clone"          -0.61
    " Scores"         -0.61
    "usercontent"     -0.59
    "enaries"         -0.59
    "ription"         -0.58
    " hail"           -0.56
    Positive Logits
    Token             Logit
    " true"           1.46
    "true"            1.40
    " happening"      1.29
    " untrue"         1.15
    "false"           0.99
    " TRUE"           0.99
    " happen"         0.98
    " occurring"      0.98
    " possible"       0.93
    " achievable"     0.92
    Act Density 0.162%
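
    The density figure can be read as the share of token positions on which the feature fires. The sketch below is a minimal illustration under that assumption; the function name `activation_density`, the zero threshold, and the sample data are hypothetical and do not come from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means (positions where the feature fires) / (all positions).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: the feature fires on 162 of 100,000 token positions.
acts = [1.5] * 162 + [0.0] * (100_000 - 162)

density = activation_density(acts)
print(f"{density:.3%}")  # prints "0.162%"
```

    Under this reading, a density of 0.162% corresponds to roughly 1.6 firing positions per 1,000 tokens.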

    No Known Activations