INDEX
Explanations
conditional phrases and inquiries about beliefs or opinions
Negative Logits
ãĤ´ãĥ³        -0.92
igslist      -0.91
âķIJâķIJ         -0.84
auri         -0.81
guiActiveUn  -0.81
Sense        -0.79
ãĥĦ          -0.77
ionic        -0.75
etheus       -0.75
Ģ            -0.73
Positive Logits
anymore      0.81
outcome      0.72
anybody      0.71
lessly       0.69
YOUR         0.66
YOU          0.66
Rosenstein   0.65
Freddie      0.64
DERR         0.64
ya           0.63
Activations Density 0.063%