INDEX
    Explanations

    mentions of elephants

    occurrences of the word "elephant" and its variations

    Negative Logits
    "hub"         -0.82
    " Hub"        -0.81
    "oplan"       -0.77
    "bp"          -0.74
    "Hub"         -0.73
    "bnb"         -0.72
    " hubs"       -0.72
    " burgers"    -0.71
    "bsp"         -0.71
    "wcs"         -0.70
    Positive Logits
    " Ele"          3.33
    "Ele"           2.40
    " ele"          2.37
    " Elephant"     1.50
    "д"             1.44
    " ELE"          1.33
    " elephants"    1.29
    "ele"           1.22
    " Alexandra"    1.12
    " elephant"     1.10
    Act Density 0.040%

    No Known Activations