INDEX
Explanations
details and descriptions related to political figures and statistics
Negative Logits
ãĤ©        -0.72
ucci       -0.67
âĵĺ        -0.67
nels       -0.66
ometimes   -0.63
©¶æ        -0.62
Äĩ         -0.61
ãĥ£        -0.60
eless      -0.59
uliffe     -0.59
Positive Logits
20         0.81
04         0.76
18         0.75
17         0.75
06         0.75
1080       0.73
rounder    0.72
angled     0.72
08         0.71
century    0.71
Activations Density 0.129%