INDEX
    Explanations

    sentences asserting a critical perspective on societal issues or individuals

    Negative Logits
    Token         Logit
    "ONSORED"     -0.61
    " Heights"    -0.60
    " favour"     -0.57
    "ointed"      -0.56
    "HI"          -0.54
    " Erit"       -0.54
    "elta"        -0.53
    " Tamil"      -0.53
    " Honour"     -0.52
    " Magicka"    -0.52

    Positive Logits
    Token         Logit
    "abouts"      1.45
    "upon"        1.15
    "fore"        0.89
    "after"       0.78
    "FORE"        0.75
    "etheless"    0.74
    " ain"        0.73
    "with"        0.72
    "'ll"         0.71
    " isn"        0.70

    Activation Density: 0.129%

    No Known Activations