Explanations
references to the humanities and social sciences as fields of study
Negative Logits
4         -0.15
8         -0.15
Todo      -0.14
uš        -0.13
ÑĥÑĪ      -0.13
3         -0.13
bounding  -0.13
edn       -0.13
CM        -0.13
symb      -0.13
Positive Logits
sciences    0.39
Sciences    0.34
Exact       0.29
exact       0.29
natural     0.28
exact       0.28
Exact       0.27
humanities  0.27
STEM        0.26
STEM        0.26
Activations Density: 0.111%