INDEX
Explanations
references to isolation and confinement
Negative Logits
papel        -0.47
Slope        -0.42
chien        -0.41
Commencez    -0.41
тивной       -0.40
Slope        -0.40
ラク         -0.39
côtés        -0.39
优先         -0.38
stoj         -0.38
Positive Logits
tightly       0.82
claust        0.81
Restriction   0.80
confines      0.79
restrictive   0.78
Strict        0.78
Efq           0.75
isolation     0.75
restriction   0.74
Restrictions  0.74
Activations Density 0.393%