INDEX
    Explanations

    references to car brands and some related terms

    Negative Logits
    Token          Logit
    "Cro"          -0.88
    " Osw"         -0.86
    "nian"         -0.85
    " yarn"        -0.80
    "wu"           -0.80
    "aina"         -0.80
    " Meow"        -0.79
    " Cro"         -0.79
    "oire"         -0.77
    " Witches"     -0.73
    Positive Logits
    Token          Logit
    " automotive"  2.07
    " automakers"  2.03
    " driver"      1.96
    " drivers"     1.95
    " Drivers"     1.93
    " automobile"  1.87
    " Driver"      1.87
    " cars"        1.87
    " Volvo"       1.85
    "Driver"       1.83
    Activation Density: 0.855%

    No Known Activations