INDEX
    Explanations

    references to the name "Victor" with varying strengths of activation

    Negative Logits
    shed        -0.80
    BOOK        -0.76
    earance     -0.74
    eworld      -0.74
    eling       -0.74
    etary       -0.73
    dress       -0.72
    ness        -0.71
    lease       -0.70
    STER        -0.70
    Positive Logits
     Hugo       0.98
     Victor     0.95
    ians        0.87
    orian       0.86
    ancouver    0.85
    iana        0.85
    inus        0.83
     Yanuk      0.83
    ines        0.82
    inian       0.82
    Activation Density: 0.021%

    No Known Activations