INDEX
    Explanations
    NEGATIVE LOGITS
    ミュニ        0.42
    answ        0.40
    ampa        0.39
    reat        0.39
    жные        0.38
    umal        0.38
    unate       0.37
    Impact      0.36
     अत         0.36
    𝐑           0.35
    POSITIVE LOGITS
    chrom       0.57
     chrom      0.55
     chroma     0.53
     Chroma     0.50
    persist     0.48
    krom        0.48
     Chrom      0.47
     trom       0.46
    Chrom       0.45
     CHROM      0.44
    Act Density 0.005%

    No Known Activations