Explanations
statements about scientific validity and the interpretation of research findings
Negative Logits
illet         -0.69
gays          -0.67
mirac         -0.66
Duchess       -0.65
Purg          -0.64
homosexuals   -0.63
cure          -0.63
washed        -0.62
cav           -0.62
magically     -0.62
Positive Logits
anecdotal       1.07
unlikely        1.03
anecd           1.00
situational     0.99
certainly       0.93
speculative     0.92
typically       0.87
observational   0.86
likely          0.85
uncertainties   0.84
Activations Density: 3.684%