INDEX
Explanations
phrases related to alignment or agreement with certain standards or policies
phrases that express consistency or alignment with policies or principles
Negative Logits
lot         -0.73
du          -0.70
gins        -0.69
zon         -0.68
close       -0.65
abouts      -0.64
challeng    -0.62
eware       -0.62
mouth       -0.62
enaries     -0.62
Positive Logits
tradition       0.99
tenets          0.98
norms           0.98
ideals          0.97
principles      0.97
teachings       0.94
expectations    0.92
traditions      0.92
guidelines      0.92
reality         0.91
Activations Density: 0.162%