INDEX
Explanations
words related to cooperation and collaboration in a professional context
Neuron Alignment
Index   Value   % of L₁
50      +0.17   1.0%
2019    +0.06   0.4%
1618    +0.05   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1618    +0.17      0.06
82      +0.06      0.06
276     +0.05      0.06
Negative Logits
<bos>                -2.17
ⓧ                    -1.19
/**                  -1.13
(no visible token)   -1.07
<?                   -0.99
/*                   -0.99
<?                   -0.98
///**                -0.93
/***                 -0.90
//};                 -0.81
Positive Logits
maneu     1.82
affor     1.75
impra     1.66
reluct    1.65
increa    1.64
emphat    1.61
accla     1.57
inev      1.56
disagre   1.55
unden     1.52
Activations Density: 0.268%