Explanations
place names, especially those that may be difficult to pronounce or are related to news events
references to specific locations and names
Negative Logits
Token     Logit
acy       -0.80
ory       -0.71
uality    -0.69
acea      -0.66
loo       -0.66
angers    -0.62
ORY       -0.60
calves    -0.60
saline    -0.60
opian     -0.60
Positive Logits
Token     Logit
eer       1.07
eering    0.97
lisher    0.95
pload     0.89
eh        0.87
eting     0.86
semb      0.84
fortable  0.84
ebus      0.83
lance     0.79
Activations Density: 0.024%