    Explanations

    specific criteria mentioned in text

    terms and phrases related to evaluation standards or guidelines

    Negative Logits
    "joy"          -0.72
    "vironment"    -0.71
    "resent"       -0.70
    "orld"         -0.69
    "ership"       -0.69
    "hand"         -0.66
    "lique"        -0.66
    "owners"       -0.65
    "rodu"         -0.65
    "ston"         -0.63
    Positive Logits
    " criteria"         1.28
    "erion"             1.01
    " criterion"        0.98
    "witz"              0.81
    " cutoff"           0.80
    " thresholds"       0.78
    "DragonMagazine"    0.72
    "pillar"            0.71
    "ifiers"            0.70
    "idelines"          0.70
    Activation Density: 0.019%

    No Known Activations