INDEX
    Explanations

    instances of the word "goes" followed by a number indicating the strength of the activation

    instances of the phrase "goes" in various contexts

    Negative Logits
    "role"         -0.74
    "eers"         -0.71
    "uctor"        -0.71
    "rient"        -0.67
    "icon"         -0.67
    "cos"          -0.66
    "essor"        -0.65
    "icons"        -0.62
    "eer"          -0.62
    "itionally"    -0.62
    Positive Logits
    "Ń·"           0.96
    "verning"      0.86
    "vt"           0.83
    " Forth"       0.83
    "lems"         0.81
    "itters"       0.80
    "OHN"          0.73
    " ashore"      0.73
    "uten"         0.73
    "ģĸ"           0.71
    Activation Density: 0.014%

    No Known Activations