INDEX
    Explanations

    phrases indicating personal choice and preference

    Negative Logits
    Token           Logit
    "acl"           -0.14
    "ãĤ«ãĥ«"        -0.14
    "edor"          -0.14
    "ÑĢеÑī"         -0.14
    " Gro"          -0.14
    "audi"          -0.14
    "burger"        -0.14
    "ordes"         -0.14
    "_globals"      -0.14
    "inas"          -0.14
    Positive Logits
    Token           Logit
    " whether"      0.20
    " Whether"      0.17
    " ########."    0.15
    " decide"       0.14
    "Whether"       0.14
    " Incontri"     0.14
    "whether"       0.14
    " Interpret"    0.14
    "æĹħ"           0.14
    " interpret"    0.14
    Activation Density: 0.054%

    No Known Activations