Explanations
texts discussing fundamental concepts or ideas
references to fundamental concepts or issues
Negative Logits
Token        Logit
nery         -0.75
quer         -0.74
Volunteers   -0.73
peak         -0.72
oping        -0.72
kers         -0.70
pless        -0.70
haw          -0.70
crow         -0.69
annis        -0.69
Positive Logits
Token        Logit
tenets       0.96
principles   0.91
ists         0.83
importance   0.83
essence      0.81
izes         0.81
elements     0.81
fundamental  0.80
principle    0.80
flaw         0.80
Activations Density: 0.014%