INDEX
Explanations
the word "conventional" followed by further descriptions or contexts
references to conventional wisdom or established norms
Negative Logits
Token      Logit
gur        -0.79
oran       -0.77
oning      -0.77
hov        -0.74
olulu      -0.73
hop        -0.71
haw        -0.71
ander      -0.70
Wanted     -0.70
Janeiro    -0.70
Positive Logits
Token          Logit
wisdom         1.18
izations       0.85
conventional   0.83
ties           0.83
ization        0.83
ventional      0.79
idad           0.78
arily          0.74
ities          0.74
ised           0.74
Activations Density: 0.016%