INDEX
    Explanations
    Negative Logits
    Token              Logit
    " themſelves"      -0.99
    " myſelf"          -0.88
    "UrlResolution"    -0.83
    " Saxons"          -0.82
    " Rumania"         -0.82
    "irical"           -0.79
    " Huguen"          -0.79
    " Svensk"          -0.78
    " mariée"          -0.78
    "]")]"             -0.76
    Positive Logits
    Token              Logit
    " city"            2.17
    " City"            2.09
    " cities"          1.94
    " CITY"            1.91
    "City"             1.75
    "city"             1.75
    " Cities"          1.72
    "cities"           1.64
    "CITY"             1.63
    "Cities"           1.61
    Act Density 0.037%

    No Known Activations