Explanations
questions ending in a question mark
questions or queries, especially those phrased as inquiries
Negative Logits
Token         Logit
marsh         -0.75
encount       -0.73
apan          -0.71
ank           -0.69
onding        -0.66
Nadu          -0.65
harness       -0.64
wilderness    -0.64
manif         -0.64
welf          -0.63
Positive Logits
Token                  Logit
Nope                   1.10
.?                     0.97
[undecodable token]    0.96
Probably               0.91
utm                    0.88
Nah                    0.85
Yep                    0.85
Huh                    0.82
Didn                   0.81
Absolutely             0.81
Activations Density 0.105%
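
For readers unfamiliar with the density figure, the sketch below shows one common way such a number could be computed: the fraction of token positions on which the feature's activation is above zero. This is an assumption about how the dashboard defines density, not something the page states. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical; the sample is sized at 21 active positions out of 20,000 so that it lands on the same 0.105% value.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition: density = (number of active positions) / (total positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 20,000 token positions, 21 of which activate the feature.
sample = [0.0] * 19_979 + [1.0] * 21
density = activation_density(sample)
print(f"Activation density: {density:.3%}")
```

A density this low would mean the feature is silent on the vast majority of tokens, which fits a feature that responds mainly to questions and short conversational replies.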