Explanations
words related to celebrities and famous individuals
names of characters and places in fictional contexts
Negative Logits
Calder       -0.58
herald       -0.58
Bett         -0.57
logging      -0.56
started      -0.56
endowed      -0.56
Clover       -0.55
imperative   -0.55
Pell         -0.55
sterling     -0.55
Positive Logits
pta          1.06
oku          0.86
ecast        0.85
atchewan     0.82
ivas         0.81
atoon        0.80
oslav        0.77
anka         0.76
inx          0.76
omore        0.74
Activations Density: 0.117%