INDEX
    Explanations

    instances of the word "change" and its variations

    Negative Logits
    Token       Logit
    rose        -0.21
    ness        -0.18
    nett        -0.17
    çĦ¶         -0.17
    rots        -0.17
    ly          -0.17
    turned      -0.16
    -China      -0.16
    _changes    -0.16
    als         -0.15
    Positive Logits
    Token       Logit
    over        0.35
    able        0.29
    overs       0.24
    /new        0.21
    /update     0.20
    ability     0.20
    /add        0.20
    cate        0.18
    iator       0.18
    Maker       0.17
    Activation Density: 0.084%

    No Known Activations