Explanations
Instances of "Rou" and its variations, suggesting a focus on mentions of a specific location.
Negative Logits
Token    Logit
Ñĥ       -0.17
rade     -0.16
idge     -0.15
RI       -0.15
nels     -0.15
rr       -0.14
rium     -0.14
Ãłn      -0.14
ases     -0.14
oods     -0.14
Positive Logits
Token    Logit
illard   0.21
illet    0.21
ille     0.19
thern    0.18
theast   0.18
illon    0.17
cou      0.17
lt       0.17
e        0.17
ette     0.17
Activations Density 0.030%