INDEX
    Explanations

    references to the concept of "world" across various contexts

    Negative Logits
    "ugier"          -0.44
    " Clever"        -0.42
    "uable"          -0.41
    (not shown)      -0.41
    "colas"          -0.41
    " prepared"      -0.40
    (not shown)      -0.40
    " ingenious"     -0.40
    " thông"         -0.40
    " Safe"          -0.40
    Positive Logits
    " world"         0.81
    " sphere"        0.81
    " mundos"        0.80
    " orbit"         0.77
    " wereld"        0.76
    "world"          0.75
    "sphere"         0.74
    " worlds"        0.72
    " esferas"       0.72
    "worlds"         0.71
    Activation Density: 0.227%

    No Known Activations