INDEX
    Explanations

    mentions of specific places or names, particularly "Ras."

    references to specific locations and structures in a city context

    Negative Logits
    Token         Logit
    "initely"     -0.78
    "cember"      -0.77
    "orders"      -0.74
    "tarians"     -0.74
    "tarian"      -0.73
    "arya"        -0.70
    "rower"       -0.69
    "isher"       -0.69
    "icum"        -0.69
    "marked"      -0.69
    Positive Logits
    Token         Logit
    " Blitz"      0.79
    "loo"         0.77
    "代"          0.76
    "ingen"       0.73
    " Giuliani"   0.70
    "lov"         0.67
    "PLE"         0.66
    " Tens"       0.66
    "Je"          0.64
    "Fake"        0.62
    Activation Density: 0.044%

    No Known Activations