INDEX
    Explanations

    phrases indicating the presentation or outlining of information

    phrases indicating plans, proposals, or arguments being presented

    Negative Logits
    Token          Logit
    " artifact"    -0.65
    "ni"           -0.64
    " luck"        -0.63
    " unnoticed"   -0.61
    "inos"         -0.60
    " nose"        -0.60
    "ão"           -0.59
    "blers"        -0.59
    "avery"        -0.58
    "iol"          -0.57
    Positive Logits
    Token          Logit
    " outlines"    1.13
    "\\\\\\\\"     0.95
    " outline"     0.91
    " outlining"   0.77
    " Goals"       0.75
    " how"         0.75
    "WER"          0.74
    " guidelines"  0.73
    " plans"       0.72
    "plan"         0.70
    Activation Density: 0.077%

    No Known Activations