INDEX
Explanations
phrases related to discussions or considerations around sensitive or controversial topics
Negative Logits
ij士        -0.73
court      -0.69
teness     -0.68
txt        -0.63
Kissinger  -0.63
Boat       -0.62
none       -0.62
OTOS       -0.61
Hicks      -0.61
clave      -0.60
Positive Logits
interact      0.96
interpret     0.95
behave        0.94
handle        0.92
cope          0.92
communicate   0.90
interpreting  0.90
coping        0.89
navigate      0.88
relate        0.85
Activations Density: 1.226%