INDEX
    Explanations

    expressions related to explaining or clarifying something

    phrases indicating intentions or propositions

    Negative Logits
    Token           Logit
    "ADRA"          -0.68
    " watershed"    -0.64
    " Lieberman"    -0.63
    " vigilance"    -0.62
    "meet"          -0.60
    " novelty"      -0.60
    "rule"          -0.59
    "Dub"           -0.59
    " TIM"          -0.59
    " extrad"       -0.58
    Positive Logits
    Token           Logit
    "orah"          0.96
    "aucus"         0.71
    "ften"          0.70
    "hower"         0.70
    "eous"          0.70
    "hesive"        0.69
    "soType"        0.69
    "oice"          0.68
    "ptoms"         0.67
    "oke"           0.67
    Activation Density: 0.082%

    No Known Activations