INDEX
    Explanations

    Chinese place names and concepts

    Negative Logits
    0.64
    0.56
    0.56
    0.56
    0.56
    0.55
    0.55
    0.55
    0.55
    0.55
    Positive Logits
     
    0.62
    0.53
    0.49
    0.48
     T
    0.46
     '
    0.46
    0.44
     B
    0.44
    0.43
    0.43
    Act Density 0.023%

    No Known Activations