INDEX
Explanations
details related to product safety instructions and warnings
Neuron Alignment
Index    Value    % of L₁
1352     +0.09    0.2%
1526     +0.08    0.2%
191      +0.07    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1788     +0.09       0.04
191      +0.08       0.04
1580     +0.07       0.04
Negative Logits
détaillé        -0.49
Hii             -0.49
intéressante    -0.49
FlatStyle       -0.48
inconnu         -0.47
.*")]           -0.47
récente         -0.46
ulihan          -0.46
woll            -0.46
dients          -0.45
Positive Logits
🤣🤣            0.65
tupperware      0.64
cushi           0.62
IsContent       0.57
🥲              0.57
🔥🔥            0.56
tutt            0.55
Fuckin          0.55
Più             0.55
vece            0.53
Activations Density 0.386%