INDEX
Explanations
phrases related to social or political conflicts and tensions
Negative Logits
Palest        -0.66
inval         -0.66
analysed      -0.61
calculating   -0.58
elig          -0.58
portraying    -0.58
aca           -0.57
describ       -0.57
secondly      -0.57
recommending  -0.57
Positive Logits
ptive      0.74
izens      0.67
project    0.65
drift      0.62
eatures    0.61
icular     0.61
ptions     0.59
sponge     0.58
counselor  0.57
OUS        0.57
Activations Density: 0.606%