Explanations
terms related to the field of philosophy
references to philosophy and philosophical concepts
Negative Logits
女         -0.82
覚醒      -0.73
ding      -0.72
cue       -0.71
Operation -0.68
ressing   -0.67
ells      -0.66
ookie     -0.66
enegger   -0.66
ilant     -0.64
Positive Logits
ophical      1.46
ophers       1.11
ophy         1.05
philosophers 1.00
philosopher  0.98
opher        0.91
Philos       0.90
Aristotle    0.88
icist        0.88
Socrates     0.87
Activations Density 0.049%