INDEX
    Explanations

    phrases and descriptors that indicate typicality or common examples

    Negative Logits
    Token               Logit
    " Forst"            -0.61
    "BorderFactory"     -0.53
    "ссер"              -0.53
    "зда"               -0.52
    "eventbus"          -0.49
    " Horne"            -0.49
    "openConnection"    -0.49
    "airbnb"            -0.48
    " Seif"             -0.48
    " raffredd"         -0.47
    Positive Logits
    Token               Logit
    "Typical"           1.30
    " typical"          1.27
    "typical"           1.24
    " Typical"          1.23
    " typique"          1.04
    "典型"              0.96
    " típico"           0.90
    " atypical"         0.90
    " TYP"              0.88
    " typically"        0.88
    Activation Density: 0.169%

    No Known Activations