INDEX
    Explanations

    phrases indicating a directive or instruction

    statements regarding personal rights or autonomy

    Negative Logits
    " guiActiveUnfocused"  -0.79
    "ĺħ"                   -0.69
    "atro"                 -0.66
    "ascal"                -0.65
    "ools"                 -0.65
    "oward"                -0.64
    "assian"               -0.61
    "ries"                 -0.61
    "Chair"                -0.61
    " chaired"             -0.59
    Positive Logits
    "abouts"    0.76
    "NESS"      0.72
    "uni"       0.72
    " Pyr"      0.68
    "optional"  0.67
    "ighting"   0.66
    " Eps"      0.65
    "pai"       0.64
    "nz"        0.64
    "ski"       0.60
    Activation Density: 0.000%

    No Known Activations