INDEX
    Explanations

    proper nouns and technical terms related to news articles or scientific research

    Negative Logits
    Token     Logit
    ifax      -0.48
    usions    -0.46
    ibur      -0.41
    sing      -0.37
    onies     -0.36
    ingham    -0.36
    scl       -0.36
    aring     -0.35
    rox       -0.35
    awks      -0.35
    Positive Logits
    Token     Logit
    BIL       0.48
    KER       0.44
    FORE      0.43
    KE        0.42
    GER       0.41
    PRES      0.40
    EG        0.38
    Ger       0.38
    ADE       0.36
    KA        0.36
    Activation Density: 0.035%

    No Known Activations