    NEGATIVE LOGITS
    Token            Logit
    ' Normalize'     -0.08
    ' vet'           -0.07
    'σετε'           -0.07
    ' histó'         -0.06
    'ize'            -0.06
    'War'            -0.06
    ' compressed'    -0.06
    'rade'           -0.06
    ' treasury'      -0.06
    ' ard'           -0.06
    POSITIVE LOGITS
    Token            Logit
    ' only'           0.08
    ' Only'           0.08
    ' owned'          0.08
    '唯一'            0.08
    ' ONLY'           0.07
    'only'            0.07
    '-lined'          0.07
    '#Region'         0.07
    '>("'             0.07
    ' beiden'         0.07
    Activation Density: 0.010%

    No Known Activations