INDEX
Explanations
references to the UK
references to a specific geographic location or cultural context
Negative Logits
Reply     -0.64
Rollins   -0.60
Bland     -0.59
poppy     -0.57
contr     -0.56
prank     -0.56
trace     -0.56
sheet     -0.56
otle      -0.55
Danger    -0.54
Positive Logits
ulkan     1.59
hov       1.46
umar      1.20
htar      1.16
wu        1.16
ovych     1.10
nown      1.08
uyomi     1.06
hari      1.06
ileaks    1.06
Activations Density 0.034%