INDEX
    Explanations

    phrases containing the word "comrade"

    mentions of a specific character or figure, particularly indicating admiration or respect

    Negative Logits
     strict    -0.73
     straight  -0.66
     pant      -0.62
     Birds     -0.62
     lit       -0.62
     Ep        -0.62
     per       -0.62
     overhead  -0.61
     ups       -0.61
     tri       -0.60
    Positive Logits
    rade        5.15
    rad         1.26
    merce       1.26
    rase        1.11
    rand        1.09
    rador       1.08
    rus         1.07
    opian       1.05
    rious       1.01
    uin         0.99
    Activation Density: 0.015%

    No Known Activations