Explanations
references to China and its variations in the text
Negative Logits
uſe       -0.80
uſed      -0.72
ſhould    -0.71
anſ       -0.69
muſt      -0.67
deſt      -0.67
cauſe     -0.66
ſee       -0.63
Anſ       -0.63
raiſ      -0.63
Positive Logits
night      0.78
commented  0.75
Nights     0.73
China      0.72
China      0.69
nights     0.68
ensement   0.67
集         0.66
night      0.65
Shanghai   0.64
Activations Density: 0.131%