INDEX
Explanations
questions asking for specific information
questions or phrases inquiring about quantities or measurements
Negative Logits
usa      -0.70
ubs      -0.70
gin      -0.69
POST     -0.65
angelo   -0.63
hari     -0.63
aughed   -0.62
article  -0.62
ICA      -0.62
idia     -0.62
Positive Logits
much         1.36
far          1.12
many         1.08
badly        1.04
much         1.03
long         1.01
accurate     1.01
MUCH         0.97
Much         0.95
efficiently  0.95
Activations Density 0.081%