INDEX
    Explanations

    second-person pronouns

    Negative Logits
     projected    -0.07
    _latency      -0.07
    جز            -0.06
    .pl           -0.06
    (self         -0.06
    antages       -0.06
    (builder      -0.06
    (users        -0.06
     semantics    -0.06
    套餐          -0.06
    Positive Logits
    0.08
    最后
    0.07
    __);
    0.07
    后勤
    0.07
     Ji
    0.07
    民航
    0.07
     Hayden
    0.07
    Gatt
    0.07
    一周
    0.07
     ***!↵
    0.07
    Act Density 0.051%

    No Known Activations