Explanations
references to indigenous communities and their cultural practices
Negative Logits
.escape    -0.17
arin       -0.16
港         -0.15
jiang      -0.14
江         -0.14
Eta        -0.14
Libert     -0.13
摩         -0.13
onds       -0.13
Jiang      -0.13
Positive Logits
Native        0.78
Native        0.69
Indigenous    0.67
indigenous    0.63
native        0.62
.Native       0.55
native        0.52
/native       0.51
.native       0.50
Aboriginal    0.50
Activations Density 0.276%