Explanations
statements that emphasize truth or authenticity
Negative Logits
apatalk     -0.59
vstack      -0.53
addington   -0.53
esity       -0.48
hornet      -0.47
nibus       -0.47
ubarak      -0.47
tisation    -0.46
aronder     -0.46
apad        -0.46
Positive Logits
true        1.37
True        1.27
True        1.25
true        1.25
TRUE        1.23
TRUE        1.14
verdadero   1.00
truest      0.98
truth       0.92
Verdad      0.92
Activations Density: 0.162%