    Explanations

    significant nouns and specific variables in contexts related to events or categories

    Negative Logits
    Token     Logit
    "V"       -0.22
    " V"      -0.20
    "K"       -0.17
    ".V"      -0.16
    "652"     -0.16
    "µ"       -0.15
    "Super"   -0.15
    "v"       -0.15
    " dep"    -0.15
    "super"   -0.15
    Positive Logits
    Token        Logit
    "olini"      0.17
    "xes"        0.17
    " Hamilton"  0.15
    "ASTER"      0.15
    "anten"      0.14
    "aster"      0.14
    "BH"         0.14
    " Ham"       0.14
    "ham"        0.14
    "°"          0.13
    Activation Density: 0.056%

    No Known Activations