INDEX
    Explanations
    Negative Logits
    Token             Logit
    'AddTagHelper'    -0.99
    'Tikang'          -0.96
    ' Tiefen'         -0.94
    ')");\n'          -0.89
    ' ſeveral'        -0.88
    'BibitemShut'     -0.88
    ']").'            -0.84
    'Життєпис'        -0.83
    ' hydrauli'       -0.83
    ' تضيفلها'        -0.82
    Positive Logits
    Token      Logit
    ' car'     2.12
    ' Car'     1.95
    'Car'      1.90
    'car'      1.87
    ' cars'    1.79
    ' CAR'     1.74
    ' Cars'    1.66
    'CAR'      1.63
    'Cars'     1.55
    'cars'     1.54
    Activation Density: 0.034%

    No Known Activations