INDEX
    Explanations

    references to political and societal issues

    Negative Logits
     manoeuv            -0.70
     Trog               -0.67
     laughter           -0.66
     Vald               -0.63
     blinded            -0.62
     travellers         -0.61
    boro                -0.61
     agony              -0.61
     doubles            -0.60
     Alic               -0.60

    Positive Logits
    ³³³                 1.33
    ³³³³                1.22
    ³³³³³³³³³³³³³³³³    1.21
    ³³³³³³³³            1.16
    ³³                  1.07
    Reason              1.05
    Specifically        0.98
    ccording            0.95
    Firstly             0.94
    Consider            0.93

    Act Density 0.420%

    No Known Activations