INDEX
    Explanations

    expressions of free will and consent

    Negative Logits
     ìĦĿ           -0.16
    imson          -0.15
    è͵             -0.14
    iros           -0.14
    _Params        -0.14
    (iOS           -0.14
     ([[           -0.13
    íĽĪ            -0.13
    ë¡Ģ            -0.13
     setHidden     -0.13
    Positive Logits
     voluntary      0.60
     vol            0.53
    vol             0.52
     Vol            0.51
     volunt         0.49
     voluntarily    0.49
    Vol             0.47
     VOL            0.47
     volont         0.43
    _vol            0.41
    Activation Density: 0.352%

    No Known Activations