INDEX
    Explanations

    negations and comparative phrases

    Negative Logits
    onium
    -0.14
    anson
    -0.12
    plements
    -0.12
    -нибудь
    -0.12
    큼
    -0.12
    itě
    -0.12
     Broad
    -0.12
     стара
    -0.12
    yped
    -0.11
    _phys
    -0.11
    Positive Logits
     only
    0.85
     ONLY
    0.74
    only
    0.74
     solely
    0.72
     Only
    0.66
    Only
    0.65
    _only
    0.64
    ONLY
    0.61
     только
    0.60
    -only
    0.60
    Act Density 0.466%

    No Known Activations