INDEX
    Explanations

    instances of words indicating permission or consent

    Negative Logits
    Token              Logit
    "ész"              -0.15
    "anje"             -0.15
    "odash"            -0.14
    "erin"             -0.14
    "ارÙĩ"             -0.14
    "asurer"           -0.14
    "ulares"           -0.14
    "/Area"            -0.13
    ".LayoutStyle"     -0.13
    " ngược"           -0.13
    Positive Logits
    Token              Logit
    " alone"           1.85
    "alone"            1.56
    " Alone"           1.50
    "-alone"           1.19
    " solo"            1.01
    "Solo"             0.84
    " seul"            0.83
    " Solo"            0.83
    " lone"            0.81
    " seule"           0.78
    Activation Density: 0.393%

    No Known Activations