    Explanations

    instances of the word "hi" with high activation values, suggesting the feature focuses specifically on this word

    repetitions of the phrase "hi."

    Negative Logits
    Token            Logit
    " Cosponsors"    -0.77
    " Aven"          -0.72
    " Sorceress"     -0.68
    " Izan"          -0.65
    " convol"        -0.64
    " Jenner"        -0.64
    "ilater"         -0.62
    " Euph"          -0.62
    "orative"        -0.62
    "NetMessage"     -0.60
    Positive Logits
    Token     Logit
    "hi"      1.16
    "ya"      1.07
    "emen"    0.98
    "hei"     0.95
    "roth"    0.93
    "oga"     0.90
    "emi"     0.89
    "wa"      0.89
    "omo"     0.87
    "oka"     0.87
    Act Density 0.006%

    No Known Activations