INDEX
Explanations
references to body parts, specifically the nose
references to the word "nose" and its variations
Negative Logits
Token         Logit
20439         -0.78
ILCS          -0.76
FORM          -0.69
ERAL          -0.69
HCR           -0.68
PowerPoint    -0.67
Avg           -0.67
IGF           -0.66
Volunte       -0.66
Feder         -0.64
Positive Logits
Token         Logit
nose          1.27
noses         1.14
cone          1.02
Nose          0.87
palate        0.87
cones         0.86
bones         0.84
pier          0.83
ysis          0.82
cone          0.81
Activations Density: 0.005%