INDEX
    Explanations

    phrases related to the concept of something working or not working

    instances of the word "work" and its variations in different contexts

    Negative Logits
    "pora"           -0.76
    "ilings"         -0.73
    "anamo"          -0.71
    "antha"          -0.71
    "gow"            -0.65
    "ensor"          -0.64
    "gart"           -0.64
    "olic"           -0.64
    "agin"           -0.64
    "xual"           -0.62

    Positive Logits
    "heet"            1.09
    "bench"           0.99
    "hops"            0.93
    " overtime"       0.90
    " miracles"       0.88
    " seamlessly"     0.86
    " wonders"        0.84
    " smoothly"       0.83
    " flaw"           0.82
    " differently"    0.82
    Activation Density 0.059%

    No Known Activations