INDEX
Explanations
mentions of specific locations or states
references to specific geographic locations and the concept of home or local identity
Negative Logits
lycer      -0.80
tsky       -0.80
attm       -0.77
ngth       -0.75
illon      -0.74
itone      -0.73
oshenko    -0.73
ende       -0.72
ï¸ı        -0.71
Äĩ         -0.71
Positive Logits
enment     0.70
snack      0.67
Gin        0.64
zoo        0.62
picnic     0.62
Babe       0.61
Warriors   0.61
smack      0.61
horrors    0.60
Goose      0.60
Activations Density: 0.140%