    Explanations

    phrases indicating an additional piece of information or emphasis

    phrases that include the term "in addition."

    Negative Logits
    " Inher"        -0.70
    "iste"          -0.69
    "venge"         -0.68
    "boys"          -0.65
    "bugs"          -0.65
    "rimp"          -0.64
    "aja"           -0.62
    "utters"        -0.62
    "fare"          -0.61
    "girls"         -0.61

    Positive Logits
    " Osw"           0.73
    "olkien"         0.72
    "igm"            0.71
    "ãĤ½"            0.70
    "xon"            0.69
    "ivity"          0.68
    "ipolar"         0.68
    " materially"    0.67
    "ngth"           0.66
    "noon"           0.65
    Activation Density: 0.021%

    No Known Activations