INDEX
    Explanations

    language related to discussions, descriptions, and explanations of situations, decisions, and events

    Negative Logits
    Token            Logit
    " confir"        -0.70
    "iem"            -0.66
    "depth"          -0.64
    " promoter"      -0.61
    "sonian"         -0.61
    " reached"       -0.58
    " gathers"       -0.57
    "ipeg"           -0.56
    " complied"      -0.55
    "iasm"           -0.54
    Positive Logits
    Token            Logit
    "phas"           1.05
    "enance"         0.94
    " themselves"    0.78
    "igate"          0.76
    "igated"         0.76
    " ourselves"     0.75
    " favorably"     0.72
    "igating"        0.71
    " blame"         0.67
    " virtues"       0.66
    Activation Density: 2.632%

    No Known Activations