Explanations
the word 'naive'
terms describing naivety or gullibility
Negative Logits
Downloadha     -0.83
ŃĶ             -0.79
foreseen       -0.75
ittee          -0.75
interrupted    -0.74
ngth           -0.73
alach          -0.72
hops           -0.71
avez           -0.69
orset          -0.68
Positive Logits
naive          1.07
naïve          0.96
ïve            0.92
sters          0.86
glers          0.79
lings          0.76
ly             0.74
innocence      0.73
wd             0.71
ster           0.71
Activations Density: 0.012%
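
The page does not say how the density figure is computed. As a rough sketch, assuming it is the fraction of token positions at which the feature's activation is above zero, it could be calculated as follows. The function name activation_density, the threshold parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (active positions) / (all positions).
    The source page does not confirm this definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 3 active positions out of 25,000 tokens.
sample = [0.0] * 24997 + [1.5, 0.7, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumed definition, a density of 0.012% would mean the feature is active on roughly 12 of every 100,000 tokens, which would fit a feature tied to a single uncommon word.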