    NEGATIVE LOGITS
    Token            Logit
    " vœux"          -0.61
    " hjär"          -0.53
    " banques"       -0.50
    " autonomía"     -0.49
    "thschild"       -0.49
    " dieux"         -0.49
    " diamants"      -0.49
    " dirigir"       -0.48
    " transparan"    -0.47
    " suun"          -0.47
    POSITIVE LOGITS
    Token            Logit
    " cases"         1.86
    " Cases"         1.72
    "Cases"          1.60
    "cases"          1.51
    " CASES"         1.41
    " casos"         1.36
    " Fälle"         1.28
    " Fällen"        1.13
    " instances"     1.03
    " gevallen"      0.98
    Activation Density: 0.026%

    No Known Activations