INDEX
    Explanations

    variations of the word "roll" in different contexts

    Negative Logits
    Token        Logit
    "rend"       -0.71
    "TAG"        -0.69
    "eal"        -0.67
    "iem"        -0.67
    "yrinth"     -0.66
    "len"        -0.64
    " comprom"   -0.64
    "ld"         -0.64
    "unction"    -0.63
    "oppers"     -0.63
    Positive Logits
    Token        Logit
    " out"       0.77
    " prevail"   0.64
    " onward"    0.63
    " Out"       0.61
    "numbered"   0.59
    "baugh"      0.57
    " down"      0.57
    "out"        0.57
    "tered"      0.56
    " thunder"   0.56
    Activation Density: 0.022%

    No Known Activations