INDEX
    Explanations

    phrases indicating preference or intention to act

    Negative Logits
    Token               Logit
    " ſtre"             -0.53
    "verwijspagina"     -0.52
    " purpoſe"          -0.52
    " تعدى"             -0.49
    "Thành"             -0.49
    " paſſ"             -0.49
    " pleaſure"         -0.48
    " ſch"              -0.47
    " ſy"               -0.47
    "RenderAtEndOf"     -0.46
    Positive Logits
    Token               Logit
    " Prefer"           0.87
    " prefer"           0.86
    " prefers"          0.84
    " preferring"       0.82
    " preferred"        0.81
    "Prefer"            0.78
    "preferred"         0.77
    "prefer"            0.76
    " liever"           0.75
    " préfé"            0.74
    Activation Density: 0.005%

    No Known Activations