    Explanations
    No Explanations Found
    Negative Logits
    "odils"         -0.94
    " emplea"       -0.94
    " Requested"    -0.91
    " extiende"     -0.89
    " Reduced"      -0.88
    " ostrich"      -0.87
    (whitespace)    -0.85
    " límite"       -0.84
    " obtiene"      -0.84
    " establece"    -0.82
    Positive Logits
    " speciale"     1.14
    (missing)       1.13
    " fiore"        1.09
    " profiter"     1.09
    " familie"      1.09
    "johtaja"       1.09
    " soldat"       1.07
    " potes"        1.07
    " gik"          1.06
    " capot"        1.05
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.