INDEX
    Explanations

    proper nouns related to specific locations or entities

    Negative Logits
    ãĥĦ                  -0.84
     guiActiveUnfocused  -0.77
    ãĥĥãĥĪ               -0.75
    wagen                -0.71
     Nanto               -0.68
     Bombay              -0.65
     Tinder              -0.64
     compuls             -0.62
     Belg                -0.61
    jamin                -0.60
    Positive Logits
    entric  1.01
    ilia    0.93
    colo    0.89
    ada     0.83
    otte    0.83
    aught   0.82
    otine   0.82
    adas    0.82
    cone    0.79
    ivil    0.78
    Activation Density: 0.005%

    No Known Activations