INDEX
Explanations
keywords related to philosophical concepts or discussions
terminology related to philosophical concepts and discussions
Negative Logits
esty       -0.80
rake       -0.72
redd       -0.72
ardless    -0.70
tower      -0.68
ording     -0.67
imgur      -0.67
lain       -0.66
sg         -0.66
ords       -0.66
Positive Logits
curiosity        0.94
ophical          0.89
underpin         0.86
philosopher      0.86
philosophers     0.86
philosophical    0.83
prec             0.76
philosoph        0.74
theoret          0.74
dile             0.73
Activations Density: 0.015%