INDEX
    Explanations

    references to wealth or affluence

    Negative Logits
    Token                 Logit
    " betweenstory"       -0.76
    " CreateTagHelper"    -0.70
    "ništ"                -0.65
    "WriteLiteral"        -0.64
    "lüğü"                -0.62
    "dańsk"               -0.62
    "illoma"              -0.61
    ":✨"                  -0.61
    "niająca"             -0.61
    "ніципа"              -0.60

    Positive Logits
    Token                 Logit
    " RICH"                0.98
    " rich"                0.88
    "rich"                 0.85
    " Rich"                0.84
    "Rich"                 0.78
    "RICH"                 0.77
    " POOR"                0.76
    " Poor"                0.74
    "ochet"                0.74
    " Ric"                 0.66
    Activation Density: 0.085%

    No Known Activations