INDEX
    Explanations

    proper names, particularly those of individuals and brands

    Negative Logits
    "ymoon"     -0.15
    " habit"    -0.15
    "سÙĪ"       -0.14
    "plen"      -0.14
    "leta"      -0.14
    "marsh"     -0.14
    "ados"      -0.14
    " ro"       -0.13
    "bib"       -0.13
    "utral"     -0.13
    Positive Logits
    "-value"    0.17
    "å̼"        0.17
    " VALUE"    0.17
    " value"    0.16
    " values"   0.16
    "VALUE"     0.16
    " viol"     0.15
    ")value"    0.15
    "value"     0.15
    " Value"    0.15
    Act Density 0.020%

    No Known Activations