INDEX
Explanations
locations or place names
Negative Logits
Token        Logit
yip          -0.72
utterstock   -0.65
itiveness    -0.57
hill         -0.56
manship      -0.55
ibaba        -0.54
Kings        -0.52
spot         -0.52
shit         -0.52
VALUE        -0.51
Positive Logits
Token        Logit
ace          0.66
astern       0.65
incial       0.62
airo         0.61
adan         0.60
rane         0.60
emporary     0.58
antes        0.57
Norn         0.56
ãĥī          0.55
Activations Density: 0.125%