INDEX
    Explanations

    phrases indicating choice or options

    Negative Logits
    Token               Logit
    " selected"         -0.86
    " Selected"         -0.84
    "Selected"          -0.78
    "selected"          -0.77
    " SELECTED"         -0.72
    " selezion"         -0.65
    " sélectionné"      -0.63
    " Selecting"        -0.55
    " sélectionnés"     -0.54
    " seleccionadas"    -0.54
    Positive Logits
    Token               Logit
    " choice"           2.05
    "choice"            1.82
    " Choice"           1.73
    " CHOICE"           1.66
    "Choice"            1.63
    " choix"            1.53
    "CHOICE"            1.48
    " cho"              1.46
    " choices"          1.43
    " CHO"              1.35
    Activation Density: 0.329%

    No Known Activations