INDEX
    Explanations

    the word "ress" followed by a high activation value, particularly "Tress"

    instances of the word "press" and its variations

    Negative Logits
    "©¶æ"            -0.83
    "ãĥ£"            -0.80
    "£ı"             -0.75
    "vernment"       -0.68
    "rily"           -0.66
    " subp"          -0.65
    "prus"           -0.64
    " volunte"       -0.64
    "lder"           -0.63
    " hemisphere"    -0.62
    Positive Logits
    "ively"          1.06
    "ions"           1.05
    "ional"          0.99
    "encer"          0.91
    "ants"           0.91
    "IVE"            0.89
    "entials"        0.86
    "entially"       0.86
    "ives"           0.86
    "mann"           0.86
    Activation Density 0.018%

    No Known Activations