INDEX
Explanations
proper nouns, particularly names of individuals
words related to identification or classification of individuals or entities
Negative Logits
Token              Logit
Ow                 -0.72
natureconservancy  -0.70
Gauntlet           -0.69
808                -0.69
treadmill          -0.68
typew              -0.67
TABLE              -0.66
Wat                -0.65
knob               -0.65
socket             -0.65
Positive Logits
Token              Logit
an                 1.55
AN                 1.41
ans                1.34
ano                1.27
ani                1.21
anic               1.20
anism              1.19
anian              1.17
ane                1.17
acan               1.17
Activations Density: 0.179%