INDEX
Explanations
mentions of different groups of people or communities
references to different groups or communities of people
Negative Logits
Token     Logit
ctors     -0.71
VM        -0.70
ous       -0.68
QL        -0.67
LV        -0.67
é¾        -0.67
Shot      -0.65
Charge    -0.64
GPU       -0.64
PM        -0.64
Positive Logits
Token     Logit
peoples   1.28
oples     0.99
minds     0.83
nations   0.81
Peoples   0.77
inhab     0.77
wana      0.76
itaire    0.74
selves    0.73
folk      0.73
Activations Density: 0.006%