Explanations
references to society or societal structures
Negative Logits
𝓸       -0.64
𝓪       -0.64
cur     -0.64
Prat    -0.61
kvar    -0.61
un      -0.61
mal     -0.61
GRAN    -0.60
ind     -0.60
ル      -0.59
Positive Logits
society         2.43
Society         2.34
SOCIETY         2.30
Society         2.27
society         2.24
societies       2.23
Societies       2.16
sociedad        1.66
SOCI            1.60
Gesellschaft    1.55
Activations Density 0.046%