INDEX
Explanations
statements or questions that indicate critical analysis or examination
Negative Logits
Token        Logit
anchester    -0.72
uly          -0.68
alian        -0.67
accustomed   -0.66
ctic         -0.65
dexter       -0.64
fal          -0.64
orts         -0.64
inki         -0.63
thal         -0.63
Positive Logits
Token        Logit
namely       1.40
Revenge      0.77
Capitalism   0.67
Return       0.63
AIDS         0.62
Marijuana    0.61
INS          0.61
Something    0.61
Catholicism  0.60
Lifetime     0.60
Activations Density: 0.233%