INDEX
Explanations
references to France and its culture
Negative Logits
jit          -0.21
ře           -0.17
LANG         -0.16
utterstock   -0.16
스트         -0.15
ehler        -0.15
abbo         -0.15
orge         -0.15
řen          -0.15
ialis        -0.14
Positive Logits
man          0.43
men          0.39
ies          0.33
town         0.31
ified        0.30
mans         0.30
woman        0.30
-speaking    0.28
spe          0.28
y            0.28
Activations Density 0.029%