    Explanations

    pronoun followed by action verb

    NEGATIVE LOGITS
    " हाद"              0.69
    "cand"              0.69
    [token missing]     0.66
    " milij"            0.66
    " besø"             0.66
    "敏感"              0.66
    " malaise"          0.66
    [token missing]     0.64
    " honeymoon"        0.64
    "快適"              0.64
    POSITIVE LOGITS
    " lung"             1.33
    " roared"           1.16
    " unleashing"       1.08
    " roar"             1.02
    " hurled"           1.00
    " screamed"         1.00
    "Lung"              0.96
    " roaring"          0.96
    " shouted"          0.96
    " snar"             0.95
    Activation Density: 0.132%

    No Known Activations