    Explanations

    phrases related to challenging situations or actions

    specific single-letter prefixes or abbreviations

    Negative Logits
    "hyde"          -0.83
    " substitutes"  -0.69
    " constructs"   -0.68
    " Fargo"        -0.68
    " dwarves"      -0.66
    " sacrific"     -0.65
    " rescued"      -0.64
    "FORE"          -0.64
    " promot"       -0.63
    " learners"     -0.62
    Positive Logits
    "anky"      1.03
    "agging"    0.99
    "ithering"  0.98
    "umbling"   0.96
    "ashing"    0.94
    "attering"  0.94
    "agg"       0.94
    "angu"      0.94
    "acious"    0.91
    "erb"       0.91
    Activation Density: 0.238%

    No Known Activations