    Explanations

    Intensifying adjectives

    Negative Logits
    [token not recovered]  -0.06
    [token not recovered]  -0.06
    ์โ  -0.06
     визнача  -0.06
     dod  -0.06
    انس  -0.06
    א  -0.06
     бра  -0.06
    вала  -0.06
     budeme  -0.06
    Positive Logits
    _DIFF  0.07
     locking  0.07
    _exp  0.07
     french  0.06
    	seq  0.06
    !")  0.06
    ButtonTitles  0.06
     perceived  0.06
     ela  0.06
     kernel  0.06
    Act Density 0.047%

    No Known Activations