INDEX
    Explanations

    references to physical appearance or aesthetics

    Negative Logits
    "/scripts"   -0.17
    "hai"        -0.17
    "ha"         -0.17
    "ácil"       -0.17
    "x"          -0.16
    "ib"         -0.15
    " Pot"       -0.15
    "age"        -0.15
    "TEGER"      -0.15
    " cur"       -0.15
    Positive Logits
    " Appearance"   0.20
    " appearance"   0.19
    "Appearance"    0.18
    "#af"           0.17
    "appearance"    0.17
    "_FT"           0.16
    "infeld"        0.15
    ".ns"           0.15
    "anje"          0.15
    "æĢĸ"           0.15
    Act Density 0.020%

    No Known Activations