    Negative Logits
    " percepción"    -0.08
    "老板"            -0.08
    " perception"    -0.08
    " trumpet"       -0.08
    " estimation"    -0.08
    "所在"            -0.08
    " bombard"       -0.08
    " monarch"       -0.08
    "εργ"            -0.07
    "Islam"          -0.07
    Positive Logits
    " shelf"         0.08
    " Bridge"        0.08
    "_bridge"        0.08
    " Francisco"     0.08
    " filmes"        0.08
    " lagu"          0.08
    "added"          0.08
    "ленные"         0.07
    "lene"           0.07
    " Args"          0.07
    Activation Density: 0.002%

    No Known Activations