INDEX
    Explanations

    phrases related to strong opinions or stances on various topics

    phrases or terms that support a particular viewpoint or ideology

    NEGATIVE LOGITS
    ĸļ
    -0.91
    ãĤ¼ãĤ¦ãĤ¹
    -0.77
     Halls
    -0.68
     Sins
    -0.65
     Gorge
    -0.65
    inia
    -0.64
     Dickens
    -0.62
     Twain
    -0.62
     Brooks
    -0.60
     pains
    -0.60
    POSITIVE LOGITS
    digy
    1.45
    dding
    1.27
    actively
    1.26
    verbs
    1.16
    pelling
    1.15
    dig
    1.11
    ccess
    1.10
    strate
    1.09
    gressive
    1.06
    ctor
    1.06
    Act Density 0.015%

    No Known Activations