INDEX
Explanations
references to locations, especially in or related to New York
the letter "Y" in various contexts
Negative Logits
ãĥ¼ãĥĨãĤ£    -0.67
ãĥ¯    -0.65
ĸļ    -0.65
ãĥķãĤ©    -0.65
raints    -0.64
facilitated    -0.64
foil    -0.64
inelli    -0.62
cientious    -0.62
idity    -0.62
POSITIVE LOGITS
STEM    1.01
ORK    1.01
ARD    0.97
BI    0.94
RS    0.93
ANK    0.93
ANG    0.92
ERE    0.91
Y    0.90
UGE    0.90
Activations Density 0.016%