Explanations
percentages or portions in text that are related to numbers
references to percentages
Negative Logits
chn           -0.71
HCR           -0.69
IDES          -0.68
pload         -0.67
ICO           -0.67
Attributes    -0.65
gotten        -0.63
ãĤ¬           -0.63
ITED          -0.63
Rav           -0.62
Positive Logits
imet          1.17
imeter        0.89
rowth         0.86
oning         0.84
lein          0.76
eele          0.76
ril           0.75
enaries       0.73
urion         0.71
ois           0.71
Activations Density: 0.015%