    Explanations

    the word 'naive'

    terms describing naivety or gullibility

    Negative Logits
    "Downloadha"     -0.83
    "ŃĶ"             -0.79
    "foreseen"       -0.75
    "ittee"          -0.75
    "interrupted"    -0.74
    "ngth"           -0.73
    "alach"          -0.72
    "hops"           -0.71
    "avez"           -0.69
    "orset"          -0.68
    Positive Logits
    " naive"         1.07
    " naïve"         0.96
    "ïve"            0.92
    "sters"          0.86
    "glers"          0.79
    "lings"          0.76
    "ly"             0.74
    " innocence"     0.73
    "wd"             0.71
    "ster"           0.71
    Act Density 0.012%

    No Known Activations