INDEX
Explanations
references to philosophy and philosophical concepts
Negative Logits
Vinci    -0.18
asa      -0.18
een      -0.16
oller    -0.16
asso     -0.16
ец       -0.16
entai    -0.15
emp      -0.15
eenth    -0.15
pulse    -0.15
Positive Logits
osoph    0.19
ically   0.17
cial     0.16
ibe      0.16
ical     0.16
/art     0.15
ippi     0.15
ền       0.15
Hue      0.14
cope     0.14
Activations Density 0.020%