Explanations
terms related to mathematical or logical structures
Negative Logits
Token    Logit
ness     -1.12
nya      -1.02
n        -0.82
د        -0.79
m        -0.71
נות      -0.68
ly       -0.68
p        -0.67
d        -0.67
lar      -0.64
Positive Logits
Token    Logit
SSI      1.39
CCI      1.38
MCI      1.37
CSI      1.23
GTI      1.23
STI      1.21
TSI      1.21
PPI      1.18
DPI      1.17
깐       1.16
Activations Density: 0.703%