INDEX
Explanations
references to the city of Toronto
mentions of the city of Toronto
Negative Logits
mble     -0.82
hardt    -0.81
bler     -0.77
vier     -0.74
ktop     -0.73
hard     -0.72
Verb     -0.71
gins     -0.71
sole     -0.70
icio     -0.69
Positive Logits
Argon    0.94
Raptors  0.93
Toronto  0.87
Harbour  0.84
Centre   0.81
Citiz    0.80
Heights  0.79
Maple    0.78
Mississ  0.77
onto     0.75
Activations Density 0.013%