INDEX
    Explanations

    phrases expressing doubt or assumptions about social justice issues

    Negative Logits
     Pink       -0.16
     Loch       -0.14
     infinity   -0.14
    fal         -0.14
     Infinity   -0.14
    093         -0.13
    .ng         -0.13
     Tube       -0.13
     outs       -0.13
    497         -0.13
    Positive Logits
    ãĥ¼ãĥª      0.18
    eks         0.17
    unes        0.16
    елÑİ        0.16
    umber       0.16
    oster       0.15
    adele       0.15
    ullan       0.14
    itude       0.14
    ject        0.14
    Activation Density: 0.359%

    No Known Activations