Explanations
words related to creation, implementation, or production processes
gerunds or present participles
Negative Logits
Token       Logit
Leilan      -0.74
Neg         -0.67
luaj        -0.65
å£          -0.65
zn          -0.64
iever       -0.63
ivari       -0.63
disarm      -0.61
herty       -0.60
netflix     -0.59
Positive Logits
Token       Logit
redients    1.19
tons        0.94
HAM         0.83
athering    0.72
Procedure   0.72
ales        0.69
ulate       0.68
GGGGGGGG    0.68
hots        0.64
rat         0.64
Activations Density 0.071%
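
To make the density figure concrete, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of sampled tokens on which the feature has a non-zero activation; the page itself does not define the term. The function name `activation_density` and the toy sample are hypothetical and are not taken from this dashboard. The sample only shows that 0.071% corresponds to 71 active tokens in 100,000.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all; the dashboard does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (hypothetical): 100,000 tokens, 71 of them with a non-zero activation.
sample = [0.0] * 99_929 + [1.0] * 71
density = activation_density(sample)
print(f"{density:.3%}")
```

At a density this low the feature is silent on nearly every token, so the top-logit lists above describe its effect only on the rare tokens where it fires.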