INDEX
    Explanations

    positive descriptions and compliments related to appearance or aesthetics

    starts with <start_of_turn> user

    Negative Logits
    Logit   Token
    -0.44   " ailes"
    -0.40   "énieur"
    -0.39   " doute"
    -0.38   " luka"
    -0.37   "Sink"
    -0.36   " веб"
    -0.36   " Duca"
    -0.36   " ingeniero"
    -0.35   " hereinafter"
    -0.35   " Highness"

    Positive Logits
    Logit   Token
     0.57   "verwijspagina"
     0.54   "脚注の使い方"
     0.53   "Personendaten"
     0.52   " pinulongan"
     0.51   "تفصیلات"
     0.51   "期刊论文"
     0.48   " surla"
     0.47   " Савезне"
     0.44   "thâu"
     0.44   " تضيفلها"
    Activation Density: 0.003%

    No Known Activations