INDEX
    Explanations

    words related to darkness or obscurity

    the token representations of the phrase "dim" and its variations in various contexts

    Negative Logits
    Token         Logit
    "OUP"         -0.74
    "REDACTED"    -0.72
    "CRIP"        -0.71
    " Aval"       -0.69
    "ALLY"        -0.68
    "AIN"         -0.67
    "Untitled"    -0.65
    " Lucia"      -0.65
    "govtrack"    -0.64
    "OAD"         -0.64
    Positive Logits
    Token         Logit
    "inished"     1.53
    "ensions"     1.33
    "ples"        1.32
    "ethy"        1.29
    "ming"        1.24
    "itri"        1.12
    "ension"      1.11
    "mers"        1.10
    "orph"        1.10
    "pling"       1.09
    Activation Density: 0.047%

    No Known Activations