Explanations
mentions of Australian state abbreviations and government-related terms
references to geographic locations, specifically in relation to New South Wales (NSW) and the Australian Capital Territory (ACT)
Negative Logits
ãĥł       -0.80
bda       -0.80
toggle    -0.72
cean      -0.71
hower     -0.68
heimer    -0.68
akra      -0.67
aphael    -0.66
arters    -0.66
zai       -0.66
Positive Logits
NSW       0.95
ADA       0.79
urst      0.78
UC        0.76
IRO       0.75
RL        0.75
CC        0.74
FW        0.74
DEF       0.71
Defence   0.71
Activations Density 0.008%