INDEX
    Explanations
    NEGATIVE LOGITS
    Token             Logit
    "DockStyle"       -0.70
    " Diſ"            -0.59
    "évaluateur"      -0.57
    " purpoſe"        -0.56
    "ISupport"        -0.56
    "Билгалдахарш"    -0.55
    "BorderFactory"   -0.54
    "DidLoad"         -0.52
    "dillos"          -0.51
    "ckså"            -0.51
    POSITIVE LOGITS
    Token             Logit
    " Pauline"         1.09
    "Pauline"          1.07
    "antine"           0.92
    " Saline"          0.91
    " Josephine"       0.88
    " Gina"            0.83
    "phine"            0.81
    "Gina"             0.79
    "maine"            0.77
    " Levine"          0.72
    Activation Density: 0.006%

    No Known Activations