    Explanations

    Attends to website-related tokens from arbitrary later tokens, particularly those that contain a specific format.

    Head Attr Weights
    0:0.12
    1:0.14
    2:0.13
    3:0.04
    4:0.05
    5:0.03
    6:0.09
    7:0.37
    Negative Logits
    NameInMap        -0.53
    EDEFAULT         -0.39
     springfox       -0.36
    }));\n           -0.36
    :✨               -0.36
    bootstrapcdn     -0.35
    parseColor       -0.35
    ")));\n          -0.34
    ensacola         -0.34
    encodeWith       -0.34
    Positive Logits
     חיצוניים         0.34
     réaliste        0.34
     passagers       0.33
     acrylique       0.32
     victime         0.31
     douceur         0.31
     psychi          0.31
    FirstResponder   0.30
     plomb           0.30
     Asoci           0.30
    Act Density 0.031%

    No Known Activations