INDEX
    Explanations

    references to a specific person's name

    Negative Logits
    ivity       -0.80
    ICAN        -0.70
    ivism       -0.69
    ivities     -0.65
     Yugoslav   -0.64
     Lisbon     -0.63
    ential      -0.63
    hs          -0.62
    REDACTED    -0.62
     Catalan    -0.61
    Positive Logits
    orthy       0.86
    stown       0.86
    sey         0.83
    arty        0.81
    cock        0.78
    bare        0.74
    afort       0.73
    quist       0.72
    hattan      0.71
    ORPG        0.70
    Act Density 0.047%

    No Known Activations