INDEX
    Explanations

    words starting with 'wh'

    occurrences of the word "wh."

    Negative Logits
    Token         Logit
    "rella"       -0.75
    " Rein"       -0.73
    "alia"        -0.71
    " Desert"     -0.66
    "tera"        -0.65
    " Celeb"      -0.65
    "rez"         -0.65
    "ovic"        -0.64
    " Variant"    -0.64
    " Romania"    -0.64
    Positive Logits
    Token         Logit
    " wh"         3.50
    " Wh"         1.86
    "wh"          1.85
    "Wh"          1.62
    " WH"         1.39
    " thw"        1.35
    " whipping"   1.16
    " whe"        1.10
    " th"         1.09
    " whale"      1.07
    Activation Density: 0.008%

    No Known Activations