    NEGATIVE LOGITS
     {              0.40
    s               0.39
    6               0.39
    })              0.39
    erequisite      0.37
    8               0.37
    threaded        0.35
    ý               0.35
    std             0.34
    ufficient       0.34
    POSITIVE LOGITS
     feminism       0.48
     capitalism     0.47
     colonialism    0.47
     turismo        0.47
     tourisme       0.46
     politics       0.45
     homelessness   0.44
     mayhem         0.44
     Catholicism    0.44
     komunikasi     0.43
    Activation Density: 0.273%

    No Known Activations