INDEX
Explanations
the word "haven" in different contexts or forms such as "haven't"
Neuron Alignment
Index    Value    % of L₁
478      +0.14    0.4%
874      +0.13    0.4%
1335     +0.12    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1335     +0.14       0.03
1557     +0.13       0.02
874      +0.12       0.03
Negative Logits
Cuer       -0.52
Justo      -0.49
granada    -0.45
Leal       -0.44
Robb       -0.43
Hvad       -0.43
Dermott    -0.43
Rosales    -0.43
Dowell     -0.42
weebly     -0.42
Positive Logits
haven        0.84
îna          0.82
haven        0.77
intitulée    0.74
Haven        0.74
Haven        0.73
:,,          0.73
;;)          0.72
.-"          0.71
sarili       0.70
Activations Density: 0.039%