Explanations
phrases related to the making or creation of things
gerund forms of verbs suggesting active processes or actions
Negative Logits
istration    -0.88
eded         -0.77
è£           -0.73
к            -0.71
imposed      -0.70
umbn         -0.70
оÐ           -0.70
pent         -0.69
ÑĢ           -0.68
ulz          -0.68
Positive Logits
Ones          1.02
Them          1.00
Gets          0.98
Comes         0.97
Makes         0.97
Difference    0.96
Advice        0.95
Wrong         0.95
Names         0.95
Together      0.95
Activations Density 0.168%