INDEX
Explanations
instances of textual data related to plans, arrangements, or hypothetical situations, typically involving discussions about future events or possibilities
Neuron Alignment
Index   Value   % of L₁
50      +0.25   1.2%
1909    +0.15   0.7%
662     +0.09   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1244    +0.25      0.03
1909    +0.15      0.03
468     +0.09      0.02
Negative Logits
<bos>       -2.95
intersper   -1.06
<?          -0.78
endow       -0.76
snoopy      -0.75
shenan      -0.75
/**         -0.74
yoda        -0.73
gild        -0.73
ⓧ           -0.72
Positive Logits
monaster    0.85
gubern      0.81
marea       0.78
coû         0.76
maig        0.72
ideolog     0.72
utop        0.71
conflic     0.71
patata      0.71
kosme       0.70
Activations Density 0.209%