INDEX
    Explanations

    references to specific locations or people with the term "sa" in them

    references to a specific individual, likely a prominent figure

    Negative Logits
     furious         -0.69
     Hex             -0.67
     neck            -0.66
     Furious         -0.65
     degener         -0.63
     interactions    -0.63
     met             -0.62
     empath          -0.62
     gears           -0.61
     batter          -0.60
    Positive Logits
    sa      4.51
    si      1.63
    sam     1.49
    SA      1.41
    sin     1.34
    sha     1.32
    sal     1.29
    Sa      1.27
    sb      1.24
    sg      1.21
    Act Density 0.007%

    No Known Activations