    Explanations

    references to diverse groups or categories

    Negative Logits
    Token             Logit
     Hens             -0.81
     Henning          -0.74
    authorised        -0.72
    ocks              -0.70
    ians              -0.70
    ctuations         -0.70
    tuuri             -0.70
    gasus             -0.69
     Norma            -0.69
    complexContent    -0.69
    Positive Logits
    Token             Logit
     Varieties        0.83
    GOTREF            0.76
     varieties        0.71
     Spice            0.71
     aihe             0.71
     variety          0.68
     Winf             0.67
    دری               0.67
     subje            0.67
     Ronde            0.66
    Activation Density: 0.009%

    No Known Activations