INDEX
    Explanations

    references to the brand "Polo" and related high-fashion terminology

    Negative Logits
    Token         Logit
    "aina"        -0.15
    "çĩ"          -0.14
    " cous"       -0.14
    "quat"        -0.14
    " martial"    -0.14
    "ìĦĿ"         -0.14
    " Serif"      -0.14
    "okie"        -0.14
    " dance"      -0.14
    " Ninja"      -0.14
    Positive Logits
    Token         Logit
    " polo"       0.47
    " Polo"       0.41
    "pol"         0.28
    " pon"        0.25
    " horses"     0.25
    " pony"       0.24
    " pol"        0.23
    " horse"      0.23
    " Pol"        0.23
    "POL"         0.22
    Activation Density: 0.001%

    No Known Activations