INDEX
    Explanations

    references to specific individuals or names

    Negative Logits
    šen         -0.15
    903         -0.15
    PLAIN       -0.14
    uÄį         -0.14
    ober        -0.14
    ourcem      -0.14
    evin        -0.14
    éģĬ         -0.14
    erin        -0.14
    åĩºåĶ®      -0.14
    Positive Logits
    imore        0.15
     Temple      0.15
     Tro         0.14
    zek          0.14
    (--          0.14
    boot         0.14
    jr           0.14
    prt          0.14
     (#          0.14
    bootstrap    0.13
    Act Density 0.051%

    No Known Activations