INDEX
    Explanations
    Negative Logits
    When  -2.30
    After  -2.25
    There  -2.20
    While  -2.09
    Just  -2.06
    Being  -2.00
    Going  -1.93
    Those  -1.91
    Even  -1.91
    Despite  -1.91
    Positive Logits
    に入れ  1.78
    の部分  1.70
     filhos  1.65
    な感じ  1.63
    だけではなく  1.59
     antigos  1.59
    部分が  1.58
    </strong>  1.57
    </u>  1.56
     tranquilo  1.55
    Activation Density: 0.006%

    No Known Activations