    Explanations

    add/remove/modify features

    Negative Logits
     estatal         0.48
     terrestre       0.47
     engulfed        0.45
     capitalist      0.44
     industriale     0.44
     vilá            0.44
     world           0.44
     systému         0.44
     américaine      0.43
     industriel      0.43

    Positive Logits
     tweaks          0.67
    某些             0.60
     modifications   0.59
     adjustments     0.59
     수정            0.58
    修改             0.57
     특정            0.57
    改进             0.56
     tweaking        0.56
     modify          0.55

    Act Density 0.022%

    No Known Activations