Explanations
phrases related to established norms or traditions
the concept of "conventional" across various contexts
Negative Logits
gur           -0.75
olulu         -0.74
oning         -0.68
interrupted   -0.68
Wanted        -0.68
hov           -0.66
giving        -0.66
Allaah        -0.63
oğ            -0.63
Janeiro       -0.62
Positive Logits
wisdom        1.38
ization       0.95
izations      0.94
ties          0.88
Warfare       0.86
isation       0.83
ventional     0.83
Wisdom        0.81
ised          0.81
notions       0.81
Activations Density 0.033%