INDEX
Explanations
numerical values and indicators of measurement
NEGATIVE LOGITS
ruz
-0.16
//{{
-0.16
(£
-0.14
ÏģοÏħ
-0.14
ãĥ³ãĥĪ
-0.14
vd
-0.14
\common
-0.14
atha
-0.14
ừa
-0.14
pch
-0.14
POSITIVE LOGITS
tens
0.17
sim
0.16
ery
0.16
Ìĥ
0.15
hun
0.15
urt
0.15
ems
0.14
order
0.14
icans
0.14
ids
0.14
Activations Density 0.083%