INDEX
Explanations
references to societal concepts and institutions
Negative Logits
ル       -0.67
Prin    -0.65
cur     -0.64
tral    -0.63
ok      -0.62
ال      -0.60
𝓪       -0.59
рас     -0.59
р       -0.59
amal    -0.59
Positive Logits
Society         2.02
Societies       2.00
SOCIETY         1.99
societies       1.97
society         1.95
Society         1.94
society         1.86
sociedad        1.35
Gesellschaft    1.28
sociedade       1.25
Activations Density: 0.058%