    Explanations

    phrases that indicate an action or decision made by a spokesperson or authority figure

    the word "that" in various contexts

    Negative Logits
    Token           Logit
    ãĤ©             -0.80
    ãĤ¼ãĤ¦ãĤ¹       -0.73
    greg            -0.69
    tein            -0.66
    EMBER           -0.66
    INFO            -0.66
    Tank            -0.65
    Ü               -0.65
    ãĥĺ             -0.65
    Thumbnail       -0.65
    Positive Logits
    Token           Logit
     although        1.37
     while           1.20
     despite         1.18
     "[              1.15
     whilst          1.04
     whereas         1.02
     unlike          1.02
     unless          1.00
     "â̦             0.92
     if              0.90
    Activation Density: 0.231%

    No Known Activations