INDEX
Explanations
names of places and cities
mentions of "San Francisco" and related abbreviations or variations
Negative Logits
favors           -0.64
glers            -0.62
offset           -0.61
deaf             -0.60
irregularities   -0.59
favor            -0.59
ilater           -0.58
knitting         -0.58
Starr            -0.57
stitch           -0.57
Positive Logits
士               0.86
eus              0.71
emy              0.70
atra             0.69
thia             0.68
Gas              0.67
ctic             0.67
escal            0.67
encia            0.66
itary            0.65
Activations Density: 0.074%