INDEX
Explanations
terms related to foundational principles or beliefs
occurrences of the word "fundamental" and its context
Negative Logits
Leone   -0.76
pload   -0.75
quer    -0.73
wick    -0.70
fing    -0.70
crow    -0.69
chief   -0.69
ensen   -0.69
spr     -0.68
annis   -0.68
Positive Logits
ists              1.22
ist               1.11
ism               1.10
misunderstanding  1.02
tenets            0.99
principles        0.97
istic             0.95
izing             0.95
alteration        0.94
izes              0.94
Activations Density: 0.032%