    Negative Logits
     accused          0.53
     characterised    0.45
     threshold        0.45
     heifer           0.44
     accusing         0.42
     alder            0.42
     lounge           0.42
     boyfriend        0.42
     guilty           0.42
     Guild            0.41
    Positive Logits
     глоба            0.46
    Aplic             0.45
     аспек            0.43
     Сасик            0.43
    aufgaben          0.43
    𒉡                 0.43
    bben              0.43
                      0.43
    timevals          0.43
    Glob              0.42
    Activation Density: 0.003%

    No Known Activations