INDEX
    Explanations

    phrases indicating important consequences or impacts

    statements about consequences or effects

    Negative Logits
    Token         Logit
    "fighters"    -0.75
    "MQ"          -0.72
    "cop"         -0.70
    "cker"        -0.68
    "fred"        -0.68
    "bug"         -0.66
    "few"         -0.66
    "ced"         -0.66
    "bows"        -0.66
    "VERSION"     -0.65
    Positive Logits
    Token               Logit
    " implications"     0.89
    " beyond"           0.84
    "romeda"            0.82
    " ramifications"    0.81
    "ogene"             0.76
    "uality"            0.71
    " arising"          0.71
    "afety"             0.69
    " ripple"           0.68
    " consequential"    0.67
    Activation Density: 0.053%

    No Known Activations