INDEX
    Explanations

    Roman numeral II

    Negative Logits
    Token       Logit
    "get"       -0.07
    "and"       -0.07
    "built"     -0.07
    "set"       -0.06
    "share"     -0.06
    "San"       -0.06
    "Top"       -0.06
    "aming"     -0.06
    "の大"      -0.06
    " Free"     -0.06
    Positive Logits
    Token       Logit
    " II"       0.21
    "II"        0.18
    " III"      0.18
    "-II"       0.15
    "III"       0.14
    " iii"      0.13
    " ii"       0.13
    " VII"      0.13
    "ii"        0.12
    " XII"      0.12
    Activation Density: 0.023%

    No Known Activations