Explanations
references to the term "citizen."
Negative Logits
Token      Logit
hiba       -0.86
ension     -0.76
hift       -0.74
creen      -0.73
ammers     -0.71
ensions    -0.71
mith       -0.71
heet       -0.71
ended      -0.69
mers       -0.69
Positive Logits
Token      Logit
ry         1.17
Citizen    0.85
hood       0.84
citizen    0.80
ially      0.78
fare       0.71
Kane       0.71
aer        0.68
uprising   0.68
ial        0.65
Activations Density: 0.008%