INDEX
Explanations
highly specific terms related to particular subjects or domains
words related to being dedicated or committed to a specific cause or subject
Negative Logits
Karin       -0.74
ANS         -0.69
FORE        -0.67
Dunham      -0.66
rett        -0.65
NER         -0.64
Ms          -0.64
Whitney     -0.64
Schne       -0.61
Greenwald   -0.60
Positive Logits
icating     0.91
ication     0.88
xual        0.85
icate       0.83
ilation     0.81
rontal      0.80
ications    0.79
aukee       0.79
ombat       0.79
icates      0.77
Activations Density: 0.029%