    Explanations

    the word "Lo" with the highest activation value

    the recurring mention of the name "Lo."

    Negative Logits
    Token           Logit
    "ãĥ¥"           -0.73
    "aur"           -0.71
    "patrick"       -0.70
    "ivery"         -0.68
    "ahime"         -0.67
    "orpor"         -0.66
    "ãģĨ"           -0.65
    " phosphate"    -0.65
    "atomic"        -0.62
    "edge"          -0.60
    Positive Logits
    Token           Logit
    " Lo"           3.63
    "Lo"            2.78
    "lo"            1.71
    " LO"           1.64
    " lo"           1.63
    "LO"            1.29
    " Ho"           1.26
    " Loop"         1.22
    " Ro"           1.13
    " Lot"          1.10
    Act Density 0.016%

    No Known Activations