INDEX
    Explanations

    proper nouns and more technical or specific terms

    words related to specific measurements or statistical terms

    Head Attr Weights
    0:0.12
    1:0.02
    2:0.39
    3:0.05
    4:0.06
    5:0.05
    6:0.04
    7:0.02
    8:0.04
    9:0.06
    10:0.07
    11:0.03
    Negative Logits
    Token         Logit
    "ctrl"        -1.19
    "Magikarp"    -1.19
    "ocument"     -1.12
    "Recipe"      -1.08
    " impart"     -1.07
    " loud"       -1.06
    " contra"     -1.05
    "swer"        -1.04
    "REDACTED"    -1.04
    "fw"          -1.04
    Positive Logits
    Token         Logit
    "heed"        1.36
    "uden"        1.35
    "imer"        1.35
    "warm"        1.30
    "itsch"       1.30
    "MpServer"    1.27
    "ascus"       1.24
    "hart"        1.22
    "emon"        1.20
    "gren"        1.18
    Act Density 0.032%

    No Known Activations