INDEX
    Explanations

    Web-scraped text

    Negative Logits
     отв               -0.07
     Dere              -0.06
    "fmt               -0.06
    -reg               -0.06
    UIS                -0.06
     Tak               -0.06
    Baseline           -0.06
     thức              -0.06
     kop               -0.06
                       -0.06
    Positive Logits
     применения        0.08
    (phase             0.07
    (endpoint          0.07
     pound             0.07
     confirming        0.07
    inan               0.06
    .getSimpleName     0.06
     للس               0.06
    ged                0.06
     deben             0.06
    Act Density 0.005%

    No Known Activations