INDEX
    Explanations

    occurrences of the word "dw" and variations thereof

    Negative Logits
    oyo               -0.16
    .scalablytyped    -0.15
    ocop              -0.15
    avage             -0.14
     Gree             -0.14
    amar              -0.14
     Lans             -0.14
    åı«               -0.13
     spit             -0.13
    .Marshal          -0.13

    Positive Logits
    elling             0.28
    ight               0.27
    eller              0.27
    indle              0.23
    dw                 0.23
    ells               0.22
     Dw                0.21
    IGHT               0.21
    ights              0.20
    .dw                0.20
    Act Density 0.010%

    No Known Activations