INDEX
    Explanations

    phrases and words indicating comparisons or changes over time

    Negative Logits
    azzi        -0.17
    strup       -0.17
    erland      -0.15
     kara       -0.15
    icator      -0.15
    .ribbon     -0.14
    rian        -0.14
    owler       -0.14
    .clf        -0.14
    ToWorld     -0.14

    Positive Logits
    amba        0.15
    tors        0.15
    vale        0.15
    visor       0.14
    ais         0.14
    ãĥĮ         0.14
    anners      0.14
    åľŃ         0.14
    KIT         0.14
    za          0.13
    Activation Density: 0.021%

    No Known Activations