INDEX
    Explanations

    examples provided are not sufficient to determine a specific pattern or preference for this neuron

    the letter "g" in various contexts

    Negative Logits
    Token           Logit
    " sclerosis"    -0.71
    " derail"       -0.69
    " regist"       -0.66
    " moderator"    -0.65
    " Wonderful"    -0.63
    "Rated"         -0.63
    " intern"       -0.62
    " booster"      -0.61
    " kickoff"      -0.59
    "FINE"          -0.59
    Positive Logits
    Token           Logit
    "asp"           1.00
    "raphics"       0.95
    "ardless"       0.92
    "uild"          0.84
    "ascript"       0.83
    "bags"          0.81
    "ars"           0.79
    "athering"      0.79
    "ods"           0.78
    "oths"          0.77
    Activation Density: 0.010%

    No Known Activations