    Explanations

    attends to the token "you" from the token "I" in various response contexts

    Head Attr Weights
    0:0.05
    1:0.06
    2:0.05
    3:0.09
    4:0.09
    5:0.05
    6:0.50
    7:0.08
    Negative Logits
    ]")]           -0.51
    __(/*!         -0.46
    setupUi        -0.45
    stdafx         -0.42
    Tembelea       -0.42
    "]))           -0.41
    fxml           -0.40
    存于互联网档案馆       -0.39
     betweenstory  -0.39
    */;            -0.38
    Positive Logits
     సౌకర్య         0.40
     yourself       0.38
     enfans         0.38
     plais          0.38
     Redacción      0.37
     crdi           0.37
     Abbé           0.36
     effectivement  0.36
     Choco          0.35
     leçons         0.35
    Act Density 0.336%

    No Known Activations