INDEX
    Explanations

    instances of emphasis or significant markers in the text

    Negative Logits
    " /\.("              -0.99
    " дописавши"         -0.90
    "AsUp"               -0.81
    " الحره"             -0.80
    "IntoConstraints"    -0.79
    "/**"                -0.79
    "+#+#"               -0.72
    "Personensuche"      -0.70
    "InjectAttribute"    -0.69
    "σιμοποι"            -0.69
    Positive Logits
    " of"                0.59
    " compos"            0.52
    "הערות"              0.49
    " (!)"               0.48
    "ocas"               0.45
    " capitolo"          0.45
    " Tübingen"          0.44
    " pozor"             0.43
    " gere"              0.42
    "</i>"               0.42
    Activation Density: 0.217%

    No Known Activations