INDEX
    Explanations

    statements involving statistics and social commentary

    Negative Logits
    ...
    -0.55
    ...↵
    -0.53
    ...↵↵
    -0.50
    )...
    -0.42
     ...↵
    -0.39
     ...
    -0.38
    ......
    -0.36
    ...(
    -0.36
    ..."↵
    -0.36
    ...,
    -0.36
    Positive Logits
    0.32
    .–
    0.32
    0.31
     🙂
    0.31
     –↵
    0.31
     ….
    0.30
     🙂↵↵
    0.30
    …………
    0.29
    ……………………
    0.29
    0.29
    –↵↵
    0.28
    Act Density 1.009%

    No Known Activations