INDEX
    Explanations

    references to articles or posts

    Negative Logits
    Token               Logit
    " acceptez"         -0.39
    " appré"            -0.33
    "Siempre"           -0.32
    " genellikle"       -0.32
    " Schluss"          -0.32
    " Pohl"             -0.31
    "ACKNOWLEDGMENTS"   -0.31
    " agradecer"        -0.31
    " Grüßen"           -0.31
    "Olá"               -0.30
    Positive Logits
    Token               Logit
    " Article"          0.99
    "Article"           0.96
    " article"          0.89
    " Articles"         0.85
    " articles"         0.82
    "<unused14>"        0.82
    "<unused74>"        0.82
    "<unused51>"        0.81
    "<unused8>"         0.81
    "[@BOS@]"           0.81
    Activation Density: 0.005%

    No Known Activations