INDEX
Explanations
names of famous individuals
proper nouns, particularly names and titles
Negative Logits
etheless             -0.91
ULAR                 -0.68
BILITY               -0.65
channelAvailability  -0.65
inarily              -0.64
malink               -0.63
rency                -0.61
itionally            -0.60
alyses               -0.60
RIPT                 -0.60
Positive Logits
Kardash     0.96
Klux        0.94
Oprah       0.82
Barbie      0.80
Kardashian  0.78
Coke        0.77
Beatles     0.76
infeld      0.75
opoly       0.74
Einstein    0.74
Activations Density 0.348%
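
An activation density is commonly read as the fraction of sampled token positions on which the feature fires. The sketch below shows one way such a figure could be computed. The function name `activation_density`, the zero threshold, and the sample values are all hypothetical and are not taken from this dashboard, whose exact definition is not stated here.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: the feature counts as "active" when its activation is
    strictly greater than the threshold (zero by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 3 of them active.
sample = [0.0] * 997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.348% would mean the feature is active on roughly 3 to 4 of every 1000 sampled tokens.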