INDEX
    Explanations
    Negative Logits
    <bos>           -3.56
                    -0.80
     springfox      -0.73
    @[+][           -0.71
    ,               -0.69
     bezeichneter   -0.67
    GEBURTSDATUM    -0.67
     дописавши      -0.66
     for            -0.66
     in             -0.65
    Positive Logits
     maroc          1.52
     cannes         1.29
     ibiza          1.25
     brava          1.23
     cioc           1.23
     incess         1.20
     ananas         1.19
     loto           1.19
     marte          1.18
     milano         1.16
    Act Density 0.148%

    No Known Activations