Explanations
phrases indicating access or entry restrictions
Negative Logits
aval      -0.16
output    -0.15
inci      -0.14
Output    -0.14
outputs   -0.14
tach      -0.14
immel     -0.14
cí        -0.13
Emb       -0.13
朋        -0.13
Positive Logits
entry     0.62
enter     0.56
entering  0.55
enters    0.54
entrance  0.53
Entry     0.52
-entry    0.51
进入      0.51
entry     0.49
enter     0.49
Activations Density 0.206%