INDEX
Explanations
positive affirmations and expressions of approval
Negative Logits
795       -0.17
ogui      -0.16
uchi      -0.15
anna      -0.15
iras      -0.15
inning    -0.14
rowning   -0.14
575       -0.14
zeÅĦ      -0.14
amen      -0.14
Positive Logits
idea      0.21
job       0.19
choice    0.18
timing    0.17
luck      0.17
Job       0.16
Idea      0.16
çĵ¦       0.16
stuff     0.16
JOB       0.16
Activations Density: 0.050%