    Explanations

    adjectives that describe the quality or characteristics of various subjects

    Negative Logits
    Token        Logit
    "ASN"        -0.16
    "ABCDEFG"    -0.16
    "ména"       -0.15
    "UILTIN"     -0.15
    "abcdef"     -0.15
    "aco"        -0.15
    "onde"       -0.14
    "619"        -0.14
    "AS"         -0.14
    "ーボ"       -0.14
    Positive Logits
    Token        Logit
    " as"        0.63
    " als"       0.34
    " sebagai"   0.30
    " как"       0.29
    "作为"       0.26
    " ως"        0.26
    " jako"      0.25
    "\tas"       0.24
    " як"        0.24
    "為"         0.23
    Activation Density: 0.079%

    No Known Activations