    Explanations

    phrases indicating potential actions or capabilities

    Negative Logits
    "ä¹ĥ"          -0.17
    "ospace"       -0.16
    "/fixtures"    -0.15
    "eti"          -0.15
    "à¸"           -0.15
    "gii"          -0.14
    "Serialized"   -0.14
    "odox"         -0.14
    "detect"       -0.14
    "icom"         -0.14

    Positive Logits
    " seen"         0.28
    " found"        0.28
    "found"         0.27
    "seen"          0.25
    "-found"        0.24
    " FOUND"        0.24
    " viewed"       0.24
    " Found"        0.23
    " Seen"         0.22
    "Found"         0.22

    Act Density 0.030%

    No Known Activations