INDEX
    Explanations

    friendly conversational expressions

    conversational expressions and interactions

    Negative Logits
    " Instr"          -0.71
    "Mobil"           -0.70
    "Footnote"        -0.65
    " Vaugh"          -0.62
    " Nielsen"        -0.60
    "ãĥ¼ãĥĨ"          -0.59
    "Equ"             -0.58
    " Barrett"        -0.58
    " Mobil"          -0.58
    " Restoration"    -0.57
    Positive Logits
    " dont"           1.15
    " english"        1.10
    " doesnt"         1.08
    " didnt"          1.04
    " alot"           1.03
    " tho"            1.03
    " americ"         1.02
    "!!!!"            0.96
    " fuck"           0.93
    " pics"           0.93
    Act Density 0.701%

    No Known Activations