    Explanations

    preferences and choices regarding various options

    Negative Logits
    Token           Logit
    " ſever"        -0.62
    " itſelf"       -0.61
    " gonz"         -0.57
    " pleaſure"     -0.57
    "AISSEE"        -0.55
    " canst"        -0.54
    " rasc"         -0.53
    "IRIS"          -0.53
    " greateſt"     -0.53
    "tagHelper"     -0.53
    Positive Logits
    Token           Logit
    " prefer"       1.48
    " prefers"      1.33
    " Prefer"       1.33
    "prefer"        1.30
    " preferred"    1.29
    " preferring"   1.29
    "Prefer"        1.27
    " preference"   1.18
    "preferred"     1.16
    "Preferred"     1.09
    Act Density 0.226%

    No Known Activations