Explanations
assertions about truth and validity in statements
Negative Logits
SharedDtor          -0.67
DockStyle           -0.67
>>>>>>>             -0.63
(blank)             -0.63
ftagPool            -0.62
InjectAttribute     -0.61
љи                  -0.54
<bos>               -0.53
MigrationBuilder    -0.52
geslacht            -0.52
Positive Logits
true                2.81
true                2.46
True                2.20
True                2.19
TRUE                2.09
TRUE                1.91
truth               1.78
truer               1.66
vrai                1.56
truest              1.49
Activations Density 0.409%
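
To make the density figure concrete, here is a minimal sketch. It assumes "activations density" means the fraction of sampled tokens on which the feature's activation is above zero; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only to reproduce the 0.409% value.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density is counted per token over a sample of text.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 409 of 100,000 tokens.
sample = [1.0] * 409 + [0.0] * 99_591
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.409%"
```

Under this reading, a density of 0.409% means the feature is active on roughly 4 of every 1,000 tokens.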