INDEX
Explanations
mentions of Britain or its related terms
Negative Logits
Token    Logit
orgh     -0.18
ephy     -0.17
adero    -0.16
itu      -0.16
ody      -0.16
ibold    -0.14
AGR      -0.14
amon     -0.14
ALAR     -0.14
YRO      -0.14
Positive Logits
Token     Logit
Isles     0.29
Columbia  0.27
ness      0.22
shire     0.20
raj       0.19
Virgin    0.18
ton       0.18
Colum     0.18
Bulld     0.17
RAIN      0.17
Activations Density: 0.025%