INDEX
    Explanations

    references to girls, particularly in contexts related to clothing and fashion

    Negative Logits
    Token             Logit
    yntaxException    -0.84
    $")               -0.81
    )”                -0.79
    ciutto            -0.77
    %"                -0.76
    )");              -0.75
    "';               -0.74
    ]"                -0.73
    ”،                -0.73
    ´)                -0.73
    Positive Logits
    Token     Logit
     girls    1.86
     Girls    1.75
    Girls     1.73
     GIRLS    1.66
    girls     1.61
     girl     1.60
     Girl     1.51
    girl      1.50
     GIRL     1.49
    Girl      1.44
    Activation Density: 0.038%

    No Known Activations