Explanations
Instances of the word "released" and its variations, indicating references to freedom or liberation.
Neuron Alignment
Index    Value    % of L₁
50       +0.27    0.9%
1013     +0.10    0.3%
690      +0.07    0.2%
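The page does not define the "% of L₁" column. One plausible reading, stated here as an assumption, is that it gives each neuron's absolute weight as a share of the L₁ norm (the sum of absolute values) of the whole weight vector. A minimal sketch of that reading, using a hypothetical helper name and a toy vector rather than this feature's real weights:

```python
def percent_of_l1(weights):
    """Return each weight's share of the vector's L1 norm.

    Assumption: "% of L1" means |w_i| / sum_j |w_j|. The dashboard
    itself does not confirm this definition.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [abs(w) / l1_norm for w in weights]


# Toy vector for illustration only (not taken from the dashboard).
toy_weights = [3.0, -1.0, 1.0]
shares = percent_of_l1(toy_weights)
```

Under this reading the shares always sum to 1, so small percentages for the top neurons would indicate a direction spread across many neurons.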
Correlated Neurons
Index    P. Corr.    Cos Sim.
284      +0.27       0.05
1013     +0.10       0.07
509      +0.07       0.06
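The page does not state how "P. Corr." and "Cos Sim." are computed. Assuming they are the standard Pearson correlation and cosine similarity, the two measures can be sketched as follows. The function names are illustrative, and which vectors the dashboard actually compares (activation series or weight directions) is not given in the source.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
```

Pearson correlation subtracts the mean before comparing, while cosine similarity does not, so the two columns can differ for the same neuron even if both are computed from the same pair of vectors.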
Negative Logits
Token            Logit
<bos>            -1.98
intersper        -1.08
(token blank)    -0.94
endow            -0.89
/***             -0.84
underval         -0.81
ⓧ                -0.78
endeavored       -0.76
<?               -0.76
enshr            -0.74
Positive Logits
Token            Logit
MLLoader         0.73
WaitForSeconds   0.69
hdas             0.67
requipa          0.66
°;               0.65
GoogleFonts      0.65
échal            0.65
Himo             0.64
ianuarie         0.64
montagna         0.63
Activations Density 0.594%
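Activation density is commonly read as the fraction of sampled tokens on which the feature is active, so 0.594% would mean the feature fires on roughly 6 of every 1,000 tokens. The page does not give its exact definition or activation threshold. The sketch below assumes "active" means an activation strictly above zero; both the helper name and the threshold are illustrative.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: a feature counts as active when its activation is
    strictly greater than the threshold (zero by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active_count = sum(1 for a in activations if a > threshold)
    return active_count / len(activations)


# Toy activations for illustration only (not taken from the dashboard).
toy_activations = [0.0, 0.0, 0.5, 0.0]
density = activation_density(toy_activations)
```

Multiplying the returned fraction by 100 gives a percentage comparable in form to the figure shown on the page.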