INDEX
Explanations
mentions of the word "science" in various contexts
Negative Logits
га          -0.17
edd         -0.17
tra         -0.17
steller     -0.17
ly          -0.16
rone        -0.16
udy         -0.16
Sciences    -0.16
undry       -0.15
ьют         -0.15
Positive Logits
fiction     0.33
-fiction    0.31
Fiction     0.28
fiction     0.23
fictional   0.21
-policy     0.19
fict        0.18
/math       0.18
fair        0.16
communic    0.15
Activations Density 0.020%