INDEX
    Explanations

    quantities and their descriptors

    Negative Logits
    " ]\n"             -0.60
    " betweenstory"    -0.57
    (token missing)    -0.56
    " &_"              -0.55
    " =>\n"            -0.52
    " للغاية"          -0.51
    "yled"             -0.50
    "couverte"         -0.50
    "]--;"             -0.50
    "umumkan"          -0.49
    Positive Logits
    " people"    0.82
    " times"     0.74
    " folks"     0.69
    " stuff"     0.69
    " things"    0.68
    " ppl"       0.66
    " lotta"     0.65
    " fois"      0.64
    "times"      0.60
    "fjspx"      0.59
    Act Density 0.202%

    No Known Activations