INDEX
Explanations
positive descriptors or attributes associated with experiences, people, or objects
Negative Logits
Token      Logit
ogens      -0.82
eligible   -0.72
arians     -0.72
Administ   -0.71
sic        -0.71
aer        -0.70
orig       -0.69
igate      -0.69
igating    -0.68
omics      -0.68
Positive Logits
Token      Logit
breeze     0.91
nice       0.85
little     0.85
additions  0.81
bonus      0.80
touches    0.80
touch      0.79
bye        0.79
neat       0.78
gesture    0.78
Activations Density: 0.018%