    NEGATIVE LOGITS
    ".network"      -0.08
    " whistle"      -0.08
    "page"          -0.07
    "Positive"      -0.07
    " enable"       -0.07
    " enjoyable"    -0.07
    " enabled"      -0.07
    " station"      -0.07
    " designers"    -0.07
    " traction"     -0.07
    POSITIVE LOGITS
    " sacrifice"    0.12
    " sacrifices"   0.11
    " sacrificed"   0.09
    " Sacr"         0.08
    " sacrificing"  0.08
    " Mort"         0.07
    " sacr"         0.07
    " Cut"          0.07
    "]--;↵"         0.07
    " тому"         0.07
    Activation Density 0.006%

    No Known Activations