INDEX
    NEGATIVE LOGITS
    Token         Logit
    "tomy"        -0.56
    "eto"         -0.54
    "Nix"         -0.52
    "ete"         -0.47
    " policemen"  -0.46
    "ucchini"     -0.46
    "hockey"      -0.45
    " Eure"       -0.45
    " Lute"       -0.45
    "Vex"         -0.44
    POSITIVE LOGITS
    Token         Logit
    " brand"      2.09
    "brand"       1.83
    "Brand"       1.83
    " Brand"      1.81
    " BRAND"      1.72
    "BRAND"       1.63
    " brands"     1.48
    "Brands"      1.43
    " Brands"     1.40
    "brands"      1.36
    Activation Density: 0.004%

    No Known Activations