INDEX
    Explanations

    phrases related to politics, legislation, and international affairs

    terms related to classification and categorization, particularly in a societal or systemic context

    NEGATIVE LOGITS
     Leilan         -0.50
     Io             -0.50
     respectively   -0.50
    $$$$            -0.45
     Mara           -0.45
    spin            -0.44
     goodbye        -0.42
     Abu            -0.42
     Fitz           -0.41
    代              -0.40
    POSITIVE LOGITS
    issions         0.60
    ibilities       0.59
    ruct            0.58
    Ĩ               0.57
    ential          0.57
    atures          0.57
    itives          0.56
    itionally       0.56
    ensed           0.56
    ework           0.55
    Act Density 0.389%

    No Known Activations