Explanations
references to anything related to Britain or British identity
Negative Logits
Token    Logit
hog      -0.17
arna     -0.15
aling    -0.15
ceased   -0.15
ķìĿ¸     -0.15
اتÙĩ     -0.14
eyJ      -0.14
.osgi    -0.14
gro      -0.14
jit      -0.14
Positive Logits
Token      Logit
Isles      0.23
ness       0.22
teen       0.21
Colum      0.18
-American  0.18
Columbia   0.15
listed     0.14
ornado     0.14
geist      0.14
uang       0.14
Activations Density: 0.023%