INDEX
    Explanations

    consequences

    Negative Logits
    Token               Logit
    " consequences"     -1.83
    " implications"     -1.64
    " ramifications"    -1.49
    " Consequences"     -1.30
    " repercussions"    -1.30
    "Consequences"      -1.23
    " Implications"     -1.21
    " consecuencias"    -1.13
    " conséquences"     -1.13
    "Implications"      -1.09
    Positive Logits
    Token               Logit
    " for"              0.66
    " like"             0.63
    ","                 0.49
    "Appear"            0.48
    " about"            0.48
    "atsen"             0.47
    " opening"          0.46
    "ندان"              0.45
    " to"               0.45
    "."                 0.44
    Activation Density: 0.050%

    No Known Activations