INDEX
    Explanations

    references to rankings or hierarchical classifications

    Negative Logits
    Token               Logit
                        -0.45
                        -0.43
    1                   -0.40
     Gol                -0.39
    â                   -0.39
    ,                   -0.39
    <eos>               -0.39
    5                   -0.38
    7                   -0.37
    screen              -0.37
    Positive Logits
    Token               Logit
    AddTagHelper        1.02
    GEBURTSDATUM        0.99
     myſelf             0.92
     increí             0.89
     indígen            0.87
     pleaſure           0.87
     Majefty            0.85
     faſt               0.84
     desmotivaciones    0.83
     ſont               0.81
    Act Density 0.173%

    No Known Activations