    Explanations

    ways to improve or contribute to something

    phrases emphasizing the importance of making something better or improving it

    Negative Logits
    Token           Logit
    "xus"           -0.75
    "orsi"          -0.67
    "lihood"        -0.66
    "isks"          -0.65
    "pta"           -0.64
    "plings"        -0.64
    "76561"         -0.63
    "been"          -0.63
    "bryce"         -0.63
    " reserves"     -0.62

    Positive Logits
    Token           Logit
    " happen"       1.12
    " disappear"    0.99
    " worthwhile"   0.96
    " believable"   0.96
    " manageable"   0.93
    " easier"       0.92
    " sustainable"  0.89
    " habitable"    0.89
    " accessible"   0.88
    " smoother"     0.88
    Activation Density: 0.125%

    No Known Activations