    Explanations

    This neuron detects language about breaking free from norms or constraints, especially phrases expressing liberation from “typical confines.”

    Negative Logits
    Token          Logit
     categoryId    -0.07
     Hastings      -0.07
     γε            -0.06
    _detalle       -0.06
     bày           -0.06
    =""↵           -0.06
    .me            -0.06
     rods          -0.06
     مدير          -0.06
     dikkate       -0.06
    Positive Logits
    Token          Logit
    plot           0.07
    OTOR           0.07
    시간           0.07
    λά             0.07
     tape          0.06
                   0.06
     ERA           0.06
     plots         0.06
     voting        0.06
     Viewer        0.06
    Activation Density: 0.002%
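
For a sense of scale, a density of 0.002% corresponds to roughly 2 activating tokens per 100,000. The sketch below is a minimal illustration, assuming density means the share of sampled tokens on which the neuron's activation is above zero; the page does not state its definition, and the function name and sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    neuron fires at all. The dashboard does not spell out its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 2 active tokens out of 100,000.
sample = [0.0] * 99_998 + [0.3, 1.2]
density = activation_density(sample)
print(f"{density:.3f}%")
```

At this sparsity a small sample of text can easily contain no activating tokens at all.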

    No Known Activations