INDEX
Explanations
statements about truth and reality
"truth" and its variants
Negative Logits
kör           -0.55
তথ্যসূত্র      -0.53
Forest        -0.53
Ro            -0.49
Forest        -0.46
newBuilder    -0.46
K             -0.45
I             -0.44
U             -0.44
<eos>         -0.44
Positive Logits
truths        1.14
Truths        1.10
TRUTH         1.06
Truth         1.06
truth         1.02
truth         0.97
Truth         0.95
Efq           0.90
цездатний     0.88
Verdad        0.86
Activations Density 0.194%