INDEX
    Explanations

    words related to general commentary or opinions

    references to group dynamics and collective actions or opinions

    Negative Logits
    "},"       -0.56
     namely    -0.54
     viz       -0.50
    ').        -0.49
     '.        -0.47
    Firstly    -0.43
     ."        -0.42
    ':         -0.42
    ".[        -0.42
     Whilst    -0.37
    Positive Logits
    ,       1.04
    ?,      0.81
    !,      0.81
    ,,      0.81
    ,...    0.79
    .,      0.78
    ,[      0.73
    *,      0.72
    +,      0.70
    ,-      0.70
    Activation Density: 3.160%

    No Known Activations