INDEX
    Explanations

    quotation marks and apostrophes in text

    quotes delimiting strings

    Negative Logits
    Kanpo        -0.34
     ouv         -0.31
    adır         -0.30
    ()));        -0.29
    ρι           -0.28
    什么呢       -0.28
    en           -0.27
     Circuit     -0.27
     Quy         -0.26
     viñ         -0.26
    Positive Logits
     propOrder       0.77
     виправивши      0.74
     nakalista       0.73
    Personendaten    0.68
    =="              0.68
    ]=="             0.68
    ftagPool         0.68
    iſche            0.67
     ProtoMessage    0.67
    iſchen           0.66
    Act Density 0.010%

    No Known Activations