Explanations
mentions of a specific location named Les
repeated mentions of the name "Les" or variations of it
Negative Logits
ICE         -0.80
ACTED       -0.72
DERR        -0.71
ãĥ¼ãĥĨãĤ£   -0.71
YING        -0.68
ANI         -0.67
è¦ļéĨĴ      -0.66
ĻĤ          -0.66
ARK         -0.66
ãĥĥãĥĪ      -0.66
Positive Logits
bians       1.13
bian        0.93
ukemia      0.81
bourg       0.81
nar         0.81
mut         0.80
leys        0.80
wana        0.78
agues       0.78
Les         0.78
Activations Density: 0.012%