Explanations
details related to everyday life situations and news
Negative Logits
Token          Logit
ç¥ŀ            -0.67
wealth         -0.62
apons          -0.61
subsistence    -0.61
grants         -0.60
ãĥĪ            -0.60
resists        -0.59
emulate        -0.58
aves           -0.58
©¶æ            -0.57

Positive Logits
Token          Logit
estern         0.83
yy             0.79
goodbye        0.76
alright        0.72
coincidence    0.72
raining        0.71
awhile         0.71
till           0.67
downhill       0.66
ECA            0.66
Activations Density 0.142%