INDEX
Explanations
words related to threats, warnings, and concerns
terms associated with political and security concerns
Negative Logits
oother       -0.79
Merge        -0.78
Transfer     -0.78
Plex         -0.77
awarding     -0.76
graft        -0.76
glomer       -0.75
assisted     -0.74
fficiency    -0.73
transfer     -0.72
Positive Logits
fears        1.69
warnings     1.62
alarm        1.59
fear         1.59
threat       1.56
warning      1.55
panic        1.54
threats      1.54
worry        1.51
scares       1.51
Activations Density: 0.702%