INDEX
    Explanations

    words related to decision-making or actions

    phrases related to conditions or actions that are contingent on prior events

    Negative Logits
    Token         Logit
    "emale"       -0.70
    "alde"        -0.66
    "cised"       -0.62
    "acial"       -0.60
    "apolis"      -0.57
    "akedown"     -0.55
    "allel"       -0.55
    "FOX"         -0.54
    "idal"        -0.54
    " Cham"       -0.54

    Positive Logits
    Token         Logit
    "something"    1.78
    "anything"     1.65
    " things"      1.63
    " something"   1.60
    "things"       1.57
    "nothing"      1.56
    "THING"        1.56
    " Something"   1.55
    " Things"      1.54
    " stuff"       1.51
    Activation Density: 0.363%

    No Known Activations