INDEX
    Explanations
    Negative Logits
    Token             Logit
    " Horowitz"       -0.71
    " Vanity"         -0.68
    " Loft"           -0.67
    " Breitbart"      -0.65
    "Ped"             -0.65
    " sidewalk"       -0.65
    " Sloan"          -0.65
    " Woodward"       -0.65
    " Guilty"         -0.64
    " Eater"          -0.64
    Positive Logits
    Token             Logit
    "uania"            1.40
    " Netherlands"     1.37
    " Islands"         1.25
    " Philippines"     1.25
    "Spain"            1.23
    "France"           1.20
    " Territories"     1.19
    "orea"             1.19
    " Yugoslavia"      1.18
    " Argentina"       1.17
    Activation Density: 0.442%

    No Known Activations