INDEX
Explanations
terms related to governance and societal structures
phrases indicating significant differences or contrasts
Negative Logits
Token        Logit
Shots        -0.58
Happy        -0.56
Tours        -0.56
Tonight      -0.54
Hungry       -0.53
anwhile      -0.53
hung         -0.52
Saturdays    -0.52
Buzz         -0.52
Happy        -0.51
Positive Logits
Token            Logit
methodological   1.02
fallacy          0.93
presupp          0.90
heterogeneity    0.89
inconsistency    0.88
empirical        0.87
obfusc           0.86
overest          0.86
limitation       0.84
bias             0.83
Activations Density 0.812%