INDEX
    Explanations

    dates in the format of year followed by a non-zero activation value

    Negative Logits
    " noisy"       -0.69
    "ythm"         -0.65
    "¿½"           -0.64
    " endless"     -0.63
    "ongh"         -0.62
    " plur"        -0.62
    " citiz"       -0.62
    "onite"        -0.62
    "und"          -0.61
    " multic"      -0.60

    Positive Logits
    " UTC"          0.81
    " partName"     0.75
    "âĶĢ"           0.75
    " Referred"     0.75
    "raq"           0.69
    " |--"          0.68
    " Dodge"        0.68
    " Apply"        0.68
    " ·"            0.68
    " RELEASE"      0.67

    Act Density 0.065%

    No Known Activations