INDEX
Explanations
adjectives expressing judgment or evaluation
descriptors of authenticity and perception
Negative Logits
rower       -0.74
ulhu        -0.71
deen        -0.70
zar         -0.68
asus        -0.67
endment     -0.63
utenant     -0.63
¬¼          -0.63
gow         -0.63
ynthesis    -0.62
Positive Logits
ones            1.56
creations       1.05
inventions      0.98
reminders       0.96
versions        0.95
equivalents     0.93
originals       0.93
replacements    0.93
additions       0.93
respectively    0.92
Activations Density 0.337%