INDEX
Explanations
names of cities and places
Negative Logits
kefeller   -0.77
ufact      -0.68
âĸ¬        -0.68
illeg      -0.67
cess       -0.66
theless    -0.65
ificial    -0.64
Argent     -0.62
plane      -0.61
Qiao       -0.61
Positive Logits
atsu       0.91
nen        0.91
uku        0.91
ikuman     0.86
uchi       0.85
hai        0.81
ovi        0.78
aki        0.77
otos       0.77
ui         0.76
Activations Density 0.129%