    Negative Logits
    " Professors"    0.40
    "Professor"      0.40
    " Festivals"     0.40
    " professor"     0.40
    "教授"            0.39
    " Professor"     0.39
    " professors"    0.38
    " Restaurants"   0.38
    " JAR"           0.37
    "💼"             0.37
    Positive Logits
    " bus"        0.92
    " buses"      0.91
    " busses"     0.83
    " автобу"     0.81
    "bus"         0.75
    " ônibus"     0.73
    " Buses"      0.70
    "Bus"         0.68
    " buss"       0.66
    " autobus"    0.66
    Activation Density: 0.008%

    No Known Activations