INDEX
    Explanations

    selections or choices being made from a group of options

    instances of the word "selected"

    Negative Logits
    loo       -0.77
    plane     -0.72
    Net       -0.69
    alone     -0.66
    pir       -0.64
     paw      -0.63
    cer       -0.62
    cow       -0.62
    threat    -0.61
    ga        -0.61
    Positive Logits
    selection      0.83
    dinand         0.81
    picked         0.81
     Selection     0.78
     randomly      0.77
     selections    0.77
    avorite        0.74
    lime           0.73
     "$:/          0.72
     selected      0.71
    Act Density 0.033%

    No Known Activations