INDEX
    Explanations

    references to specific historical figures or groups

    references to the comedy group Monty Python

    Negative Logits
    Token             Logit
    " Painter"        -0.74
    "ratulations"     -0.72
    "redo"            -0.71
    "GREEN"           -0.68
    "drm"             -0.67
    " deduction"      -0.63
    " attribution"    -0.60
    "verb"            -0.60
    " deductions"     -0.59
    "ARP"             -0.59

    Positive Logits
    Token             Logit
    "rules"           0.74
    "atl"             0.70
    "ository"         0.69
    "ouf"             0.69
    "arella"          0.66
    "ethy"            0.65
    "helle"           0.64
    "liam"            0.61
    "ollar"           0.61
    "ague"            0.58
    Activation Density: 0.165%

    No Known Activations