INDEX
    Explanations

    instances where something is discovered or observed

    phrases that indicate the discovery of something unexpected or significant

    Negative Logits
    " derog"         -0.83
    "vote"           -0.72
    "jong"           -0.70
    "ongo"           -0.69
    "escal"          -0.69
    "xus"            -0.67
    "commit"         -0.67
    " derivative"    -0.67
    "otine"          -0.67
    "imo"            -0.66
    Positive Logits
    " waking"        0.76
    " lifeless"      0.72
    "LESS"           0.72
    " greeted"       0.70
    " beautiful"     0.69
    "tons"           0.67
    " emptiness"     0.67
    " behold"        0.66
    " reapp"         0.66
    " snowy"         0.66
    Activation Density: 0.200%

    No Known Activations