INDEX
    Explanations

    phrases where someone is expressing an opinion or argument

    statements or claims made by individuals

    Negative Logits
    Token        Logit
    ptives       -0.72
    adesh        -0.70
    theless      -0.65
    ================================================================  -0.63
    prus         -0.63
    ãĥ¼ãĥĨ       -0.62
    estern       -0.60
    https        -0.60
    uador        -0.60
    FTWARE       -0.60
    Positive Logits
    Token            Logit
    ,,               0.76
    *,               0.72
    ,                0.71
     convinc         0.70
     bluntly         0.66
     goodbye         0.61
    !,               0.60
     omin            0.59
     cyn             0.59
     emphatically    0.58
    Activation Density: 0.187%

    No Known Activations