INDEX
    Explanations

    topics related to legal cases, political figures, and controversial issues

    topics related to social and political issues

    Negative Logits
    Token           Logit
    " comr"         -0.74
    " mathemat"     -0.66
    " incap"        -0.66
    "ailability"    -0.64
    "ITNESS"        -0.61
    "omever"        -0.58
    " princ"        -0.57
    " secondly"     -0.57
    "ModLoader"     -0.57
    "orically"      -0.56
    Positive Logits
    Token              Logit
    " Replay"          1.18
    " }}"              0.85
    "<|endoftext|>"    0.83
    " }"               0.77
    " ï"               0.77
    " |"               0.76
    " »"               0.74
    "}}"               0.74
    ">]"               0.74
    " ]"               0.73
    Activation Density: 0.182%

    No Known Activations