INDEX
Explanations
names of people and places
Negative Logits
Token          Logit
Clover         -0.67
endowed        -0.65
sterling       -0.60
Calder         -0.55
Grimes         -0.55
silent         -0.55
pim            -0.54
glass          -0.54
Kessler        -0.53
maintaining    -0.53
Positive Logits
Token          Logit
pta            0.87
adic           0.81
atoon          0.80
oku            0.78
anka           0.77
oslav          0.77
hom            0.76
inx            0.75
iman           0.73
atchewan       0.72
Activations Density: 0.058%