Explanations
phrases related to identity or classification
Negative Logits
Token        Logit
766          -0.15
Gardner      -0.15
İş           -0.14
SizeMode     -0.14
iesz         -0.14
uria         -0.13
альне        -0.13
ighthouse    -0.13
McDonald     -0.13
aru          -0.13
Positive Logits
Token        Logit
consts       0.16
enthal       0.15
ypse         0.15
itty         0.14
andi         0.14
hull         0.14
-mf          0.14
941          0.14
евич         0.14
rance        0.13
Activations Density: 0.064%