    Explanations

    The word "pop" and its variants, in contexts that refer to rising popularity or visibility.

    Negative Logits
    }{*}{}
    -0.93
    )$\\
    -0.84
    )");
    
    -0.83
     %}
    -0.80
    \}\\
    -0.80
    "}")
    -0.80
    }\]
    -0.79
    "):
    
    -0.78
    __":
    
    -0.78
    hdashline
    -0.78
    Positive Logits
    Token         Logit
    " pop"        1.26
    " Pop"        1.20
    "pop"         1.17
    "Pop"         1.17
    " pops"       1.14
    " POP"        1.13
    " popping"    1.07
    " popped"     1.02
    " Popo"       1.00
    "POP"         1.00
    Activation Density: 0.011%

    No Known Activations