INDEX
    Explanations
    Negative Logits
    Token         Logit
    "quart"       -0.75
    "puter"       -0.74
    "kson"        -0.70
    "entious"     -0.69
    " Anthem"     -0.67
    "eneg"        -0.66
    "endo"        -0.66
    "estic"       -0.64
    "umin"        -0.64
    "FG"          -0.64
    Positive Logits
    Token         Logit
    " instincts"  1.02
    "ously"       0.98
    " survival"   0.89
    "arily"       0.89
    " instinct"   0.88
    " Survive"    0.79
    "ist"         0.76
    "ists"        0.75
    " Survival"   0.72
    " deterrent"  0.71
    Activation Density: 0.023%

    No Known Activations