Explanations
phrases that assert or emphasize the concept of truth
stating truth
Negative Logits
token       logit
sonrası     -0.51
harapkan    -0.47
abgesch     -0.45
waarbij     -0.42
inaugural   -0.42
årene       -0.41
katanya     -0.41
fuese       -0.41
出版年      -0.40
ayos        -0.40
Positive Logits
token       logit
truth       1.24
truth       1.03
TRUTH       0.97
reality     0.90
Wahrheit    0.89
Truth       0.88
Truth       0.85
vérité      0.82
reality     0.82
真相        0.72
Activations Density: 0.010%
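
The density figure can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading only; it assumes "density" means the fraction of positions with activation above a threshold of zero, which the source does not state. The function name `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (active positions) / (all positions). The
    dashboard's exact definition is not given in the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 1 of 10,000 token positions.
sample = [0.0] * 9_999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.010%
```

Under this reading, a density of 0.010% corresponds to roughly one active position per 10,000 tokens, which fits a sparse feature tied to a narrow concept.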