INDEX
Explanations
references to science
mentions of "science" in various contexts
NEGATIVE LOGITS
Token      Logit
STATES     -0.66
oser       -0.63
terior     -0.62
raine      -0.62
ESH        -0.61
leased     -0.61
Hearts     -0.61
âĢ¢âĢ¢     -0.60
Seasons    -0.60
Kessler    -0.60
POSITIVE LOGITS
Token      Logit
fiction    1.25
Fiction    1.21
icist      0.98
fiction    0.95
craft      0.91
literacy   0.87
istries    0.80
mong       0.80
sonian     0.79
bench      0.77
Activations Density 0.049%
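
The density figure above can be read as the share of sampled token positions on which the feature fires. The sketch below illustrates that reading under stated assumptions: the page does not say how the figure is computed, so the strictly-greater-than-zero threshold, the function name `activation_density`, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of token positions where
    the feature is active. The source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy data: 10,000 token positions, 5 of them active.
toy = [0.0] * 9995 + [0.3, 1.2, 0.7, 2.1, 0.05]
density = activation_density(toy)
print(f"{density:.3f}%")
```

A density this low would mean the feature is sparse, firing on only a handful of positions per ten thousand tokens.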