INDEX
Explanations
questions or exclamations expressing surprise or confusion
questions expressing disbelief or confusion
Negative Logits
Peninsula   -0.68
RC          -0.62
trop        -0.60
Discover    -0.58
BST         -0.56
Nord        -0.56
MAP         -0.55
Outdoor     -0.55
Interior    -0.55
correctly   -0.54
Positive Logits
soever      1.34
else        0.87
happ        0.82
ever        0.81
happened    0.79
nces        0.78
Cause       0.76
happens     0.76
amaz        0.72
?!          0.72
Activations Density 0.122%