Explanations
words related to technical processes or concepts
terms related to specific roles, actions, and organizational structures
Negative Logits
å§«        -0.63
éĹĺ        -0.59
Pengu      -0.52
landish    -0.51
Painter    -0.50
é¾įå       -0.50
emale      -0.50
Georg      -0.50
farious    -0.50
Moor       -0.48
Positive Logits
matically  0.69
(£        0.53
SCP        0.52
ciation    0.52
fully      0.51
lishes     0.50
Megan      0.50
atu        0.49
lessly     0.49
ams        0.49
Activations Density 0.882%