INDEX
Explanations
names of individuals or entities with high activations for various terms
proper nouns and technical terms
Negative Logits
PB            -0.70
DIRECT        -0.67
calendars     -0.66
DAY           -0.65
microphones   -0.65
Entered       -0.65
fitness       -0.65
ACTION        -0.65
IMAGES        -0.65
PRODUCT       -0.64
Positive Logits
adish         1.00
opa           0.90
ek            0.88
ioch          0.87
eal           0.87
oad           0.87
inse          0.87
izoph         0.86
alm           0.84
obic          0.83
Activations Density 0.347%