INDEX
    Explanations

    phrases related to transformation or conversion

    instances of the word "a" or phrases indicating transformation into something new

    Negative Logits
    times     -0.78
    enance    -0.77
    agree     -0.77
    alian     -0.73
    alty      -0.72
    acid      -0.72
    books     -0.68
    words     -0.68
    bots      -0.68
    sun       -0.66
    Positive Logits
     manageable     0.98
     profitable     0.89
     cohesive       0.86
     usable         0.85
     nightmare      0.85
     viable         0.83
     vortex         0.82
     caricature     0.82
     respectable    0.81
     lifelong       0.81
    Act Density 0.152%

    No Known Activations