INDEX
    Explanations

    words indicating an opinion or statement

    phrases that begin with "That," indicating a focus on statements or claims

    Negative Logits
    20439      -0.83
    ¶ħ         -0.75
    busters    -0.74
    oby        -0.71
    ccoli      -0.71
    owder      -0.69
    ussia      -0.69
    earch      -0.69
    iege       -0.66
    emis       -0.66
    Positive Logits
     said       1.14
    's          1.04
     being      0.96
    cher        0.95
     includes   0.93
     aside      0.93
     doesn      0.92
     begs       0.89
     leaves     0.88
     means      0.87
    Activation Density: 0.087%

    No Known Activations