INDEX
Explanations
phrases related to countries or communities
references to societal structures and collective identities
Negative Logits
Token         Value
âĿ            -0.76
bots          -0.72
mares         -0.71
Emails        -0.70
Ambro         -0.70
uden          -0.69
Joy           -0.67
Rus           -0.66
etz           -0.65
xus           -0.65
Positive Logits
Token         Value
result        1.07
consequence   1.01
standalone    0.98
spectator     0.88
predictor     0.87
footballer    0.84
member        0.84
viable        0.83
cohesive      0.83
tool          0.83
Activations Density: 0.097%