INDEX
    Explanations

    rhetoric, rhetorical devices, strategies

    Negative Logits
    Token     Value
    "ä"       1.33
    "ui"      1.10
    "io"      1.08
    "os"      1.02
    "ö"       0.98
    "ede"     0.94
    "es"      0.93
    "ling"    0.93
    "f"       0.92
    "ies"     0.91
    Positive Logits
    Token          Value
    ")。"          0.95
    (blank)        0.91
    (blank)        0.91
    " catég"       0.89
    "W"            0.86
    (blank)        0.86
    (blank)        0.86
    " encontra"    0.86
    "ني"           0.85
    ")’"           0.85
    Act Density 0.029%

    No Known Activations