INDEX
    Explanations

    specific formatting and numerical structures within the text

    Negative Logits
    stdc              -0.63
    AsUp              -0.62
    UserScript        -0.56
    delwed            -0.54
    complexContent    -0.54
    TypedDataSet      -0.50
     surla            -0.49
    Autoritní         -0.49
    Aholisi           -0.49
    fromnode          -0.48

    Positive Logits
     Fieber            0.41
     múltiple          0.37
    itäten             0.36
     Leicht            0.35
     chevaux           0.35
     Wünsche           0.35
     inaccuracies      0.35
     lápis             0.35
     Libertad          0.34
     curie             0.34

    Act Density 0.012%

    No Known Activations