INDEX
    Explanations

    Occurrences of the word "screen" and its variants in various contexts

    Negative Logits
    Token          Logit
    "adam"         -0.07
    "alia"         -0.06
    "elay"         -0.06
    " Fil"         -0.06
    "stro"         -0.06
    "cadena"       -0.06
    "essler"       -0.06
    " ourselves"   -0.06
    "Scope"        -0.06
    "acid"         -0.06
    Positive Logits
    Token          Logit
    "ーレ"         0.07
    "izable"       0.07
    "inalg"        0.07
    "лаз"          0.07
    "INY"          0.07
    "/at"          0.06
    "оит"          0.06
    "IVED"         0.06
    "arrants"      0.06
    "ール"         0.06
    Activation Density: 0.005%

    No Known Activations