INDEX
    Negative Logits
     religiosos     -0.09
     commemorate    -0.09
     Captain        -0.08
     religiosa      -0.08
     percentages    -0.08
    posé            -0.08
     captain        -0.08
     religious      -0.08
    aufnahme        -0.08
     disper         -0.08
    Positive Logits
     quadratic      0.08
    AIza            0.08
                    0.08
     substit        0.08
     экзам          0.08
     Ferrari        0.08
    cars            0.07
    .Argument       0.07
     encar          0.07
    .SE             0.07
    Act Density 0.021%

    No Known Activations