    Explanations

    This neuron detects mentions of government or intelligence actors (e.g., governments, agencies, spies).

    Negative Logits
    Token        Logit
    "ěst"        -0.06
    "_splits"    -0.06
    "ôle"        -0.06
    "ho"         -0.06
    "ména"       -0.06
    " Perl"      -0.06
    (blank)      -0.06
    "ât"         -0.06
    " Bite"      -0.06
    "<Input"     -0.06
    Positive Logits
    Token          Logit
    ".active"      0.07
    " Canberra"    0.07
    " minOccurs"   0.07
    "_enter"       0.06
    " Kushner"     0.06
    ".des"         0.06
    "((↵"          0.06
    " без"         0.06
    " halluc"      0.06
    "нер"          0.06
    Activation Density: 0.011%

    No Known Activations