INDEX
Explanations
proper nouns such as names of people or places
mentions of a specific name, "Clement," and its variations
Negative Logits
KK        -0.76
Mp        -0.71
ulner     -0.71
Ô         -0.71
netflix   -0.70
awar      -0.70
History   -0.68
knit      -0.67
Vi        -0.67
KS        -0.67
Positive Logits
enary     1.02
ially     0.87
Clement   0.87
ial       0.86
xual      0.86
lement    0.82
rina      0.77
opher     0.76
agall     0.75
ropy      0.74
Activations Density: 0.017%