INDEX
    Explanations

    phrases indicating ability and the presence of specific subjects or entities

    future actions or states

    Negative Logits
     betweenstory     -0.54
    StructEnd         -0.53
    󠁴                 -0.51
    ImageContext      -0.48
    bootstrapcdn      -0.44
                      -0.43
    +:+               -0.43
    Ӕ                 -0.43
    SharedDtor        -0.43
    ItemBackground    -0.42
    Positive Logits
     zwiſchen         0.55
     dieſes           0.51
     Hyp              0.51
     SPH              0.50
     imagui           0.50
    Stag              0.50
     dieſen           0.49
    Nix               0.49
     okuyayım         0.48
     dieſem           0.48
    Act Density 0.031%

    No Known Activations