INDEX
    Explanations
    Negative Logits
    Token         Logit
    "Philly"      -0.50
    " Philly"     -0.48
    " antis"      -0.46
    "Bris"        -0.46
    " Fod"        -0.45
    "ombes"       -0.44
    " Bris"       -0.44
    " Andersson"  -0.43
    " Mandy"      -0.43
    "жидан"       -0.43
    Positive Logits
    Token         Logit
    " Lake"       1.27
    "Lake"        1.20
    " lake"       1.18
    " LAKE"       1.18
    " lakes"      1.06
    " Lakes"      1.01
    "lake"        0.97
    "Lakes"       0.96
    "LAKE"        0.86
    "lakes"       0.82
    Act Density 0.020%

    No Known Activations