INDEX
    Explanations

    general statements or overviews in text

    Negative Logits
    "aily"        -0.73
    "him"         -0.70
    "ËĪ"          -0.70
    "fest"        -0.66
    "imm"         -0.64
    "ocaust"      -0.62
    "aciously"    -0.61
    "ocracy"      -0.60
    "gio"         -0.59
    "vg"          -0.58
    Positive Logits
    "adays"         0.98
    " speaking"     0.89
    "ccording"      0.80
    "entimes"       0.77
    " Speaking"     0.73
    " we"           0.73
    " there"        0.73
    "terday"        0.73
    " though"       0.69
    " commenters"   0.69
    Activation Density: 0.131%

    No Known Activations