Explanations
differences or disagreements in opinions or beliefs being expressed
Negative Logits
wash          -0.81
Laksh         -0.74
guards        -0.73
oil           -0.71
Fior          -0.71
advertising   -0.69
Continue      -0.69
guard         -0.67
Cu            -0.66
interns       -0.65
Positive Logits
terms         1.14
wording       1.02
principle     1.01
relation      0.99
regards       0.98
clusions      0.97
iple          0.96
temperament   0.93
diam          0.91
geography     0.90
Activations Density 1.285%