INDEX
Explanations
phrases related to being available for help or assistance
Neuron Alignment
Index   Value   % of L₁
1081    +0.08   0.2%
459     +0.08   0.2%
348     +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1081    +0.08      0.03
1025    +0.08      0.03
1073    +0.08      0.03
Negative Logits
.*")]      -0.49
egreg      -0.46
Illus      -0.46
Keny       -0.46
Rodrig     -0.44
Mej        -0.43
Juf        -0.43
Breakout   -0.43
Rese       -0.43
montée     -0.42
Positive Logits
Ottobre      0.67
quegli       0.64
distance     0.60
affitto      0.60
Luglio       0.58
vece         0.58
Giugno       0.58
partecipa    0.57
ajuns        0.57
chiunque     0.56
Activations Density 0.123%