INDEX
Explanations
assertions of truthfulness or validity
NEGATIVE LOGITS
Token             Logit
Rhymes            -0.61
Manchuria         -0.54
yntaxException    -0.54
etera             -0.53
Denna             -0.52
Epistle           -0.52
AutoScaleMode     -0.52
Sides             -0.51
AppBundle         -0.51
Cabello           -0.51
POSITIVE LOGITS
Token             Logit
False             0.77
Tru               0.77
True              0.76
(!__              0.75
False             0.74
TRUE              0.70
believers         0.70
isTrue            0.70
TRUE              0.68
uyler             0.67
Activations Density: 0.101%