INDEX
Explanations
questions phrased in an inquisitive format
Negative Logits
Plat      -0.15
plat      -0.15
ayd       -0.14
pped      -0.14
oup       -0.14
äs        -0.14
Gard      -0.14
annels    -0.14
rium      -0.14
irsch     -0.14
Positive Logits
esson     0.15
NonQuery  0.15
ajar      0.15
echa      0.15
kı        0.14
ãĤ¿ãĥ«    0.14
olson     0.14
klass     0.14
.Blocks   0.14
926       0.14
Activations Density: 0.065%