INDEX
Explanations
phrases related to scientific research and data analysis
instances of the word "This" indicating new ideas or concepts introduced in the text
Negative Logits
ARS      -0.77
unk      -0.76
aws      -0.71
amia     -0.67
icons    -0.64
adle     -0.63
oller    -0.63
rums     -0.63
isms     -0.61
own      -0.61
Positive Logits
latter         0.89
contrasts      0.86
article        0.83
arrang         0.83
culminated     0.82
discrepancy    0.81
particular     0.80
trope          0.80
week           0.79
phenomenon     0.78
Activations Density: 0.163%