    Negative Logits
    󠁮
    -0.56
    󠁣
    -0.53
     ujednoznacz
    -0.50
    Rot
    -0.47
     Nottingham
    -0.46
    actualité
    -0.46
     يتيمه
    -0.46
     tqdm
    -0.45
    Fifth
    -0.45
    borderBottom
    -0.45
    Positive Logits
     agency
    2.17
    agency
    2.06
    Agency
    2.05
     Agency
    2.00
     AGENCY
    1.90
     agencies
    1.71
     Agencies
    1.70
    agencies
    1.62
    Agencies
    1.54
     agencia
    1.41
    Activation Density: 0.005%

    No Known Activations