INDEX
    Explanations

    references to the word "head" in various contexts

    Negative Logits
     nahilalakip    -0.81
    UNITY           -0.72
    solicit         -0.71
     Urqu           -0.70
    momix           -0.69
    ="{{$           -0.69
     Arora          -0.68
    rungsseite      -0.68
    rachtet         -0.68
    āci             -0.67
    Positive Logits
     HEAD           2.02
     head           1.96
     Head           1.95
     heads          1.89
    Head            1.85
     Heads          1.76
    head            1.75
    HEAD            1.70
    heads           1.60
    Heads           1.52
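
The positive-logit tokens above differ only in case, leading space, and plural form. A minimal sketch of that observation follows. The `normalize` helper is hypothetical (it is not part of any tool referenced here), and the values are copied from the list above, with a leading space treated as part of the token.

```python
# Positive-logit tokens copied from the list above.
# A leading space is part of the token, so " HEAD" and "HEAD" are distinct entries.
positive_logits = {
    " HEAD": 2.02,
    " head": 1.96,
    " Head": 1.95,
    " heads": 1.89,
    "Head": 1.85,
    " Heads": 1.76,
    "head": 1.75,
    "HEAD": 1.70,
    "heads": 1.60,
    "Heads": 1.52,
}


def normalize(token: str) -> str:
    """Hypothetical helper: strip whitespace, lowercase, drop a trailing plural 's'."""
    word = token.strip().lower()
    return word[:-1] if word.endswith("s") else word


# Collapse every surface form to its normalized stem.
stems = {normalize(tok) for tok in positive_logits}
print(stems)
```

If the set collapses to a single stem, the top promoted tokens are all surface variants of one word, which is consistent with the explanation given for this feature.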
    Activation Density: 0.044%

    No Known Activations