Explanations
research-related vocabulary
The neuron detects “result-reporting” or “assertion” cues: verbs and phrases used to present findings or conclusions (e.g. demonstrate, indicate, suggest, it is known, allege).
Negative Logits
Token         Logit
gee           -0.07
,对           -0.07
Matt          -0.06
呈            -0.06
genocide      -0.06
firefighter   -0.06
дам           -0.06
филь          -0.06
лер           -0.06
anxiety       -0.06

Positive Logits
Token         Logit
besch         0.06
.Companion    0.06
puls          0.06
staged        0.06
pcap          0.06
'*            0.06
odox          0.06
로드          0.06
'../../../    0.06
qemu          0.06
Activations Density 0.125%
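
The page does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of token positions on which the neuron's activation is above zero; the function name, the threshold parameter, and the toy data are hypothetical and not taken from the source:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the neuron fires.
    The source page does not state its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: the neuron fires on 1 of 800 token positions.
toy_activations = [0.0] * 799 + [2.3]
density = activation_density(toy_activations)
print(f"{density:.3%}")  # 0.125%
```

Under this reading, a density of 0.125% corresponds to roughly one active token in every 800.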