    Explanations

    terms related to authority and governance

    Negative Logits
    Token         Logit
    "entin"       -0.16
    "ød"          -0.14
    "oki"         -0.14
    "lish"        -0.14
    " fancy"      -0.14
    "ек"          -0.14
    "ango"        -0.14
    ".virtual"    -0.13
    "uf"          -0.13
    " gaze"       -0.13
    Positive Logits
    Token         Logit
    " Larger"     0.26
    " larger"     0.24
    " largest"    0.22
    " large"      0.21
    "large"       0.21
    " bigger"     0.20
    " LARGE"      0.20
    "Large"       0.20
    "-largest"    0.19
    " smaller"    0.19
    Act Density 0.008%

    No Known Activations