INDEX
    Explanations

    words related to items that are processed or transformed into something else

    references to future events and expected outcomes

    Negative Logits
    Token              Logit
    " emphasizing"     -0.68
    " emphasizes"      -0.63
    "SOURCE"           -0.63
    "SourceFile"       -0.62
    " Intervention"    -0.62
    " injecting"       -0.61
    " imposing"        -0.61
    " mindful"         -0.60
    " Respons"         -0.59
    "cffffcc"          -0.59
    Positive Logits
    Token              Logit
    " fetch"           1.10
    " expire"          1.08
    " circulate"       1.07
    " langu"           1.07
    " belonged"        1.06
    " belong"          1.00
    " disappear"       0.95
    " vanish"          0.95
    " undergo"         0.94
    " arrive"          0.93
    Activation Density: 0.313%

    No Known Activations