INDEX
Explanations
references to scientific experiments
occurrences of the word "experiment."
Negative Logits
Token          Logit
othy           -0.70
headers        -0.67
doms           -0.66
CHAPTER        -0.65
nergy          -0.63
ACTED          -0.62
cut            -0.60
clinton        -0.60
vae            -0.59
miah           -0.59
Positive Logits
Token          Logit
imental        1.03
iments         0.99
iment          0.91
experiment     0.91
Experiment     0.89
ally           0.84
ually          0.79
eers           0.78
experimenting  0.77
experimented   0.74
Activations Density: 0.016%