INDEX
Explanations
phrases related to body parts and medical conditions
the presence of conjunctions and connective phrases
Negative Logits
enhagen           -0.78
TPS               -0.73
Brend             -0.73
Brain             -0.72
looph             -0.70
INTON             -0.69
Accountability    -0.68
Task              -0.68
Armed             -0.67
Subcommittee      -0.66
Positive Logits
sic               1.12
ensis             0.82
pron              0.82
barley            0.75
cloth             0.74
marg              0.73
literally         0.73
recl              0.73
rice              0.72
leather           0.71
Activations Density: 0.530%