INDEX
    Explanations

    sports, study, cats, diversity

    Negative Logits
    " kiddos"           0.77
    " badass"           0.73
    " underwhelming"    0.70
    " moniker"          0.68
    " inherently"       0.65
    " workaround"       0.65
    " Leveraging"       0.63
    " leveraging"       0.62
    " shenanigans"      0.62
    " bolstering"       0.62
    Positive Logits
    " newspapers"       0.64
    " businessmen"      0.64
    "spapers"           0.59
    " policemen"        0.58
    " famous"           0.57
    " clothes"          0.55
    "clothes"           0.55
    " Businessman"      0.54
    " berühm"           0.52
    " spoilt"           0.52
    Activation Density: 0.020%

    No Known Activations