INDEX
    Explanations

    different types or categories of items or concepts

    Negative Logits
    OGND                -0.86
     queſto             -0.77
    MigrationBuilder    -0.77
    [@BOS@]             -0.75
    <unused28>          -0.75
    <pad>               -0.75
    <unused52>          -0.75
    <unused74>          -0.75
    <unused41>          -0.75
    <unused8>           -0.74
    Positive Logits
     stuff              0.33
     thing              0.30
     cuestión           0.30
    K                   0.29
    Reprodução          0.29
     folks              0.28
     cosas              0.28
     ahí                0.28
     chrétienne         0.27
     part               0.27
    Act Density 0.237%

    No Known Activations