INDEX
    NEGATIVE LOGITS
    " Référence"    -0.57
    "<bos>"         -0.47
    "laund"         -0.46
    "tagext"        -0.42
    "antum"         -0.42
    " reman"        -0.41
    "  ("           -0.41
    "arschu"        -0.39
    " Star"         -0.39
    " Las"          -0.39
    POSITIVE LOGITS
    "OGND"            0.71
    " للمعارف"        0.67
    "tanleria"        0.63
    "Hochspringen"    0.60
    "нциклопедия"     0.59
    " OFDb"           0.59
    "appé"            0.58
    "covite"          0.58
    "negan"           0.57
    " MotionEvent"    0.56
    Activation Density: 0.496%

    No Known Activations