    Explanations

    phrases centered around questioning legitimacy and authority

    the validity/legitimacy

    Negative Logits
        Token             Logit
        " hint"           -0.53
        " никак"          -0.48
        " réflexion"      -0.48
        "Increment"       -0.47
        " saja"           -0.47
        " diatur"         -0.46
        "geber"           -0.46
        "dif"             -0.46
        "increment"       -0.46
        " discussion"     -0.45
    Positive Logits
        Token             Logit
        " validity"       1.67
        "validity"        1.30
        " legitimacy"     1.29
        " effectiveness"  1.25
        " veracity"       1.21
        " suitability"    1.20
        " authenticity"   1.19
        " accuracy"       1.19
        " correctness"    1.19
        " adequacy"       1.16
    Activation Density: 0.625%

    No Known Activations