Explanations
numerical representations of quantities or counts
Negative Logits
oods       -0.17
инки       -0.16
urons      -0.16
apes       -0.15
flows      -0.15
rades      -0.15
uling      -0.15
ulas       -0.14
illions    -0.14
inton      -0.14
Positive Logits
person         0.24
member         0.20
member         0.19
fold           0.19
person         0.19
-member        0.18
-person        0.18
:item          0.18
dimensional    0.17
Person         0.17
Activations Density: 0.128%