Explanations
phrases containing the word "truth"
references to the character or concept of "truth"
Negative Logits
cells        -0.76
Inferno      -0.67
Wolves       -0.66
Stra         -0.64
Chips        -0.64
Protective   -0.63
zzi          -0.62
Heritage     -0.62
Nose         -0.59
Lens         -0.59
Positive Logits
uth          1.38
reys         1.18
atsu         1.07
anasia       0.95
osate        0.95
ouse         0.93
guiActiveUn  0.88
ayer         0.87
UTH          0.83
onse         0.83
Activations Density: 0.004%