    NEGATIVE LOGITS
    Token             Logit
    " türlü"          -0.50
    " Brüder"         -0.49
    "putable"         -0.49
    "ideration"       -0.48
    "upply"           -0.47
    "avía"            -0.47
    "leggings"        -0.47
    "pullover"        -0.46
    "ufficient"       -0.46
    " bezeichneter"   -0.46
    POSITIVE LOGITS
    Token             Logit
    " Ms"             1.08
    "Ms"              1.04
    " ms"             0.89
    " noyau"          0.83
    " Cfr"            0.82
    " convenable"     0.80
    " MS"             0.78
    " Ams"            0.76
    " renforcé"       0.73
    " rafra"          0.72
    Activation Density: 0.027%

    No Known Activations