INDEX
Explanations
phrases or words related to basic concepts or principles
references to fundamental concepts or necessities
Negative Logits
romeda              -0.79
igham               -0.74
oping               -0.73
ibal                -0.70
Tycoon              -0.70
isSpecialOrderable  -0.67
encer               -0.67
Rodrigo             -0.66
prow                -0.65
estern              -0.64
Positive Logits
necessities         1.10
tenets              1.04
basic               0.98
arithmetic          0.92
lly                 0.89
principles          0.88
premise             0.81
gradient            0.80
outline             0.79
essential           0.78
Activations Density: 0.022%