INDEX
Explanations
scientific or technical terms related to various fields such as medicine, data science, and economics
instances of the end-of-text token
Negative Logits
20439         -0.71
Late          -0.60
aughs         -0.59
lled          -0.59
ungle         -0.58
CI            -0.56
soundtrack    -0.55
Sea           -0.55
habi          -0.55
endment       -0.54
Positive Logits
historians     0.76
programmers    0.75
enthusiasts    0.73
specialists    0.73
experts        0.73
nerds          0.71
policymakers   0.71
researchers    0.70
ographers      0.69
regulators     0.69
Activations Density 0.431%