INDEX
Explanations
proper nouns related to politics and individuals
consonant clusters or specific letter patterns in text
Negative Logits
conclud      -0.69
skelet       -0.67
carbohyd     -0.64
ãĥ¼ãĥĨ       -0.63
CONCLUS      -0.61
encount      -0.61
BOOK         -0.61
conduc       -0.61
ACTIONS      -0.60
occas        -0.55
Positive Logits
oda          0.78
anski        0.77
amer         0.76
oe           0.74
uk           0.72
nik          0.71
la           0.70
ilia         0.68
anders       0.68
onga         0.67
Activations Density 0.364%