INDEX
Explanations
linguistic elements and structural patterns within varied language contexts
Negative Logits
Physical       -0.16
ιÏİ            -0.16
ка             -0.15
physical       -0.15
Physical       -0.14
otics          -0.14
DirectoryName  -0.14
oro            -0.14
physical       -0.14
TJ             -0.14
Positive Logits
_Lean          0.15
essim          0.14
/ext           0.14
plac           0.14
OLON           0.13
Americ         0.13
_placement     0.13
uchi           0.13
rico           0.13
emoth          0.13
Activations Density 0.081%