INDEX
    Explanations

    adjectives related to processes or abstract concepts

    critical phrases related to significant consequences or risks

    Negative Logits
    ;     -0.75
    ,"    -0.72
    %,    -0.70
    .     -0.69
    !,    -0.69
    ,'    -0.68
    .,    -0.68
    .;    -0.68
    %;    -0.67
    .]    -0.66
    Positive Logits
    etheless          0.90
    efully            0.83
     newcom           0.67
     teasp            0.65
    sequently         0.65
    urther            0.65
    ventus            0.64
    DragonMagazine    0.61
    lly               0.61
    PsyNetMessage     0.61
    Activation Density: 0.952%

    No Known Activations