INDEX
    Explanations

    occurrences of the word "contain" and its variations

    Negative Logits
    "ſſung"          -0.80
    "<unused68>"     -0.79
    "<unused23>"     -0.79
    "<unused52>"     -0.79
    "<unused51>"     -0.79
    "<unused14>"     -0.79
    "[@BOS@]"        -0.79
    "<unused8>"      -0.78
    "<unused3>"      -0.78
    "<pad>"          -0.78
    Positive Logits
    " contain"        0.68
    " containing"     0.63
    " contains"       0.62
    " contienen"      0.54
    " contiene"       0.53
    "contain"         0.51
    " written"        0.49
    " zawiera"        0.47
    " bevat"          0.47
    " содержит"       0.47
    Activation Density: 0.172%

    No Known Activations