INDEX
Explanations
Instances of the word "express" and its variants, indicating a focus on communication or formal declarations.
Negative Logits
Token     Logit
vip       -0.16
wh        -0.15
acre      -0.15
awah      -0.15
uteur     -0.15
other     -0.15
aktu      -0.15
ίκ        -0.15
ètre      -0.15
exempt    -0.15
Positive Logits
Token     Logit
mond      0.15
iena      0.15
iegel     0.15
orado     0.14
ynos      0.14
ance      0.14
nest      0.14
ively     0.14
EDA       0.14
430       0.14
Activations Density: 0.031%
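
The density figure can be read as the share of tokens on which the feature fires at all. The sketch below illustrates that reading under stated assumptions: the page does not define how density is computed, so the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    activation. The source page does not state its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 tokens, 3 of which activate the feature.
acts = [0.0] * 10_000
for i in (12, 4_500, 9_999):
    acts[i] = 1.7

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.030%
```

Under this reading, a density of 0.031% would correspond to roughly 3 activating tokens per 10,000 sampled.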