INDEX
    Explanations

    mentions of the word "rap" at varying strengths of activation

    instances of the word "rap" in various contexts

    Negative Logits
    " wills"               -0.75
    " Takeru"              -0.67
    "bright"               -0.63
    " Mississ"             -0.62
    " recall"              -0.62
    " Jong"                -0.59
    " Schwarz"             -0.59
    " Lauder"              -0.58
    " Garg"                -0.58
    "natureconservancy"    -0.58
    Positive Logits
    "odcast"                1.15
    "rap"                   1.09
    "hene"                  0.96
    "olitics"               0.94
    "heny"                  0.94
    "artisan"               0.89
    "ixel"                  0.88
    "inion"                 0.87
    "ascal"                 0.86
    "oline"                 0.85
    Activation Density: 0.007%

    No Known Activations