    Explanations

    references to subjects and topics within the text

    Negative Logits
    Token             Logit
    "ushing"          -0.17
    "ØŃÙĬ"            -0.16
    "ardo"            -0.15
    "/preferences"    -0.15
    "ersions"         -0.15
    "uder"            -0.15
    "zo"              -0.15
    "co"              -0.15
    "lear"            -0.15
    "usp"             -0.14
    Positive Logits
    Token             Logit
    "ivity"           0.44
    "ively"           0.42
    " matter"         0.39
    "matter"          0.35
    "ivities"         0.35
    "ive"             0.32
    "ivism"           0.29
    " Matter"         0.29
    "ivist"           0.29
    "IVE"             0.25
    Activation Density: 0.015%

    No Known Activations