INDEX
    Explanations

    specialized adjectives describing specific concepts

    Negative Logits
     अज  0.38
     historic  0.36
    ENES  0.35
    0.35
    血压  0.34
     Condiciones  0.34
    icol  0.34
     歴史  0.34
     Beyer  0.34
    straight  0.34
    Positive Logits
     filha  0.45
    носить  0.42
     barbarian  0.42
     تشکیل  0.42
     inductive  0.41
     kinder  0.41
    طيكم  0.41
    ocking  0.41
     spaw  0.41
     Ted  0.40
    Act Density 0.003%

    No Known Activations