Explanations
words related to facilities or infrastructure
Negative Logits
Token       Logit
eper        -0.17
een         -0.15
ucken       -0.15
icamente    -0.15
ling        -0.15
oen         -0.15
ight        -0.15
рип         -0.15
ổi          -0.14
ear         -0.14
Positive Logits
Token       Logit
ilitation   0.27
fac         0.27
ilit        0.27
ult         0.27
ilities     0.25
ilitating   0.24
fac         0.23
ILITY       0.22
Fac         0.21
ILITIES     0.21
Activations Density 0.009%
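
The density figure can be read as the share of sampled tokens on which the feature fires at all. That reading is an assumption, since the page does not define the term. A minimal sketch of the arithmetic under that assumption follows; the function name, the threshold parameter, and the sample data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active. The source page does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 9 of 100,000 tokens.
sample = [0.0] * 99_991 + [1.0] * 9
print(f"{activation_density(sample):.3%}")  # prints 0.009%
```

On this reading, a density of 0.009% corresponds to roughly 9 active tokens per 100,000 sampled.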