INDEX
    Explanations

    mentions of factories

    NEGATIVE LOGITS
    laus          -0.89
    soever        -0.78
    ĺħ            -0.74
    ï¸            -0.74
    lihood        -0.72
    thood         -0.66
     Liberties    -0.65
    partisan      -0.63
    theless       -0.63
     venerable    -0.61
    POSITIVE LOGITS
     factory      1.11
    actory        1.00
    orer          0.86
    rador         0.79
     worker       0.77
    rats          0.76
     Worker       0.76
    arde          0.75
     builder      0.75
     factories    0.74
    Activation Density: 0.015%

    No Known Activations